In this paper, we consider the finite-state multiple access channel (MAC) with partially cooperative encoders
and delayed channel state information (CSI). Partial cooperation here is in the sense that the encoders communicate
with each other through finite-capacity links. The channel states are assumed to be governed by a Markov process.
Full CSI is assumed at the receiver, while only delayed CSI is available at the transmitters. The capacity region
of this model is derived by first solving the case of the finite-state MAC with common message. Achievability for
the common message case is established using rate splitting, multiplexing and simultaneous decoding. Simultaneous
decoding is crucial here since it circumvents the need to rely on the capacity region's corner points, which becomes
cumbersome as the number of messages to be sent grows. The common message result is then used to derive the
capacity region for the case with partially cooperating encoders. Next, we apply this general result to the special case
of the Gaussian vector MAC with diagonal channel transfer matrices, which is suitable for modeling, e.g., orthogonal
frequency division multiplexing (OFDM)-based communication systems. The capacity region of the Gaussian channel
is presented in terms of a convex optimization problem, which can be solved efficiently using numerical tools. The
region is derived by first presenting an outer bound on the general capacity region, and then suggesting a specific
input distribution that achieves this bound. Finally, numerical results are provided that give valuable insights into the
practical implications of optimally using conferencing in order to maximize the transmission rates.

Capacity region, Common message, Convex optimization, Cooperative encoders, Delayed CSI, Diagonal vector
Gaussian multiple-access channel, Finite-state channel, Fourier-Motzkin elimination, Multiple-access channel,
Simultaneous decoding.



I. INTRODUCTION 

Time-varying channels have been drawing increasing attention over the past few years. This
is due to the fact that these channels successfully model wireless communication systems, which constitute the
most prevalent form of communication today. In a wireless setting, the user's motion and the changes in the
environment, as well as interference, may lead to temporal changes in the channel quality. The time-varying channel
characteristics give rise to the need for channel state information (CSI) estimation. For example, the Long Term
Evolution (LTE) cellular communication standard uses pilot signals that are transmitted at pre-scheduled time
intervals and frequency slots in order to estimate the channel's state [1]. These estimations are performed at the
receiver and then commonly fed back to the transmitter. Frequently, the feedback is not instantaneous, so the
transmitter has access only to delayed CSI. In addition, it seems that future wireless communication
is heading towards user cooperation in order to achieve enhanced performance. In view of the above, we explore
in this paper the impact of both cooperation and the availability of CSI. We focus on a setting of a finite state
Markov (two-user) multiple access channel (FSM-MAC), with partially cooperative encoders and delayed CSI, as
illustrated in Fig. 1 and explained in the following.

Fig. 1: FSM-MAC with partially cooperative encoders, CSI at the decoder and delayed CSI at the encoders with
delays $d_1$ and $d_2$.

In the communication scenario under discussion, each of the two encoders wishes to send an independent private
message through a time-varying MAC to the decoder. Delayed CSI is assumed to be available at the encoders,
while full delayless CSI is assumed at the decoder. Different users may be subject to different CSI delays. It is
further assumed that prior to each transmission block, the two encoders are allowed to hold a conference. More
specifically, it is assumed that the encoders can communicate with each other over noise-free communication links
of given capacities. We restrict the discussion to the case in which the conference held between the encoders is
independent of the CSI.

The non-state-dependent MAC with partially cooperative encoders was first introduced by Willems [2], who also
derived the capacity region for the discrete memoryless setting. Special cases of this channel model include the case
in which the encoders are ignorant of each other's messages (i.e., the capacities of the communication links between
them are both zero), and the case where the encoders fully cooperate (i.e., the capacities of the communication
links are infinite). The first setting, where no conference is held, corresponds to the classical MAC, for which
the capacity region was determined by Ahlswede [3] and Liao [4]. In contrast, in the second setting, where total
cooperation is available, the encoders can act as one by fully sharing their private messages via the conference.
The capacity region for this case is the part of the first quadrant below the so-called total cooperation line. This
triangle-shaped region always contains the capacity region for the classical MAC.

In his proof of achievability for the conferencing MAC in [2], Willems produced a coding scheme based on the
capacity region for the MAC with common message. This problem was defined and solved by Slepian and Wolf
[5]. Willems showed that in order to achieve the capacity region, the encoders should use the cooperation link to
share parts of their private messages and then use a coding scheme for the ordinary MAC with common message.
Although Willems's model allows interactive communication between the encoders, it was shown both in [2], and
later in [6], that optimality is achieved even with a single round of communication between the encoders (referred
to as a "pair of simultaneous monologues" in [2]).

Additional multiuser settings that involve cooperation between users through communication links of finite
capacities have been extensively treated in the literature. The study on such channels includes works on the MAC
[7], [8], interference channel [9]-[15], broadcast channel [16], relay channel [17], [18], and cellular networks [19].
A comprehensive survey of cooperation and its role in communication is given in [20]. It is important to note,
however, that in all of the above settings the channel was not assumed to be time-varying.

Naturally, multiuser settings combining time-varying channels and user cooperation were the next juncture in
research. A Gaussian fading MAC with cooperating encoders that have access to delayless CSI was considered
in [21], [22]. As in our case, these works assumed that the cooperation is allowed only before the CSI becomes
available at the encoders. A different approach, in which the cooperation occurs after the state information becomes
available, was treated in [23]. In this work, a MAC with perfect non-causal CSI available at the encoders was
considered. The coding scheme presented in [23] used the conferencing in order to share parts of the messages as
well as the state information regarding the channel's variation in time.

The notion of modeling time-varying channels as state-dependent channels dates back to Shannon [24]. In that
work, Shannon introduced and characterized the capacity of the state-dependent, memoryless point-to-point channel
with independent and identically distributed (i.i.d.) states available causally at the encoder. Gel'fand and Pinsker
[25] and, later, Heegard and El Gamal [26], studied the case where the encoder observes the channel states
non-causally. They derived a single letter formula for the capacity using a binning coding scheme. In [27], Goldsmith
and Varaiya considered a fading channel with perfect CSI at both transmitter and receiver. They showed that given
the instantaneous and perfect state information the transmitter can adapt the data rates to each of the channel's
states, thus maximizing the average transmission rate.

The impracticality of perfect CSI steered the research of state-dependent channels to consider models involving
imperfect CSI. At first, different cases of imperfect CSI of an i.i.d. state sequence were treated. Caire and Shamai
[28] considered a state-dependent model in which the CSI at the transmitter was assumed to be a deterministic
function of the CSI at the receiver. They showed that the optimal coding scheme is particularly
simple. Namely, it was shown that optimal codes can be constructed directly over the input alphabet, while in
general, coding over an expanded alphabet is required. In [29], Lapidoth and Steinberg provided an inner bound
for the capacity region of the MAC with strictly causal CSI at the encoders. As opposed to the point-to-point case,
where strictly causal CSI regarding an i.i.d. state sequence does not increase capacity, the MAC's capacity region
is strictly increased as a result of it. Li et al. presented an improved inner bound for the same setting in [30]. The
innovative idea of an information theoretical model involving delayed CSI, where the states are not i.i.d., was first
introduced by Viswanathan who presented and solved the FSM point-to-point channel with delayed CSI [31]. In a
similar manner to Viswanathan, we model temporal variations by means of a FSM channel [32], [33]. The channel
state is determined on a per symbol basis and governed by the underlying FSM process.

As research of FSM channels with delayed CSI continued to gain momentum, an important extension of this
novel idea to the multiuser case was introduced. In [34], Basher et al. considered the FSM-MAC with delayed CSI
and ignorant encoders, i.e., where no conference is held (see also [35] for a related source coding analysis).
In the proof of the capacity region for this model, achievability was established by employing a coding scheme
based on successive decoding. Successive decoding was used in order to demonstrate that the two corner points of
the capacity region are achievable. The whole capacity region is then achievable via time-sharing.

In our setting, where conferencing takes place, a different approach is needed. Since the achievability for the
conferencing model is based on the common message coding scheme, we start by solving the FSM-MAC with
common message and the same CSI properties as in [34], which remained an open problem until the current paper.
Next, using the achievable scheme of the common message setting, the achievability of the conferencing region is
established. However, providing an achievable coding scheme for the common message setting based on achieving
the region's corner points has turned out to be an awkward task. This is due to the large number of corner points
which are induced by the presence of an additional transmission rate, namely the rate of the common message.
Therefore, we present a more general scheme that achieves every possible point in the region, rather than just the
corner points. We use rate-splitting and multiplexing-coding in the encoding stage. Since these ideas were also
used in [34], the structure of the encoders in both schemes is similar. The decoding process, on the other hand,
is entirely different because we consider a scheme by which the decoder decodes all the messages simultaneously.
Simultaneous decoding complicates the analysis of the probability of error, yielding a very large
number of inequalities for the partial rates. Fortunately, using Fourier-Motzkin elimination, the partial rates
can be eliminated, and the rate constraints can be expressed via a small number of inequalities, which specify
the capacity region. The above simultaneous decoding scheme for the MAC with delayed CSI is one of the most
significant contributions of our paper. Not only does it generalize the coding scheme presented in [34] and can be
used for the ignorant encoders setting, but it also sets the ground for constructing coding schemes for a general
number of users. This construction can be done by a direct and trivial extension of the scheme we present here.
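
As a rough illustration of the elimination step mentioned above, the following minimal Python sketch projects one variable out of a system of linear inequalities. It is not the elimination carried out in the proof: the function name `fm_eliminate`, the toy rate-splitting system and its numeric bounds are assumptions chosen only to show the mechanics.

```python
def fm_eliminate(rows, k):
    """Eliminate variable k from a system of linear inequalities.

    Each row is (coeffs, bound) and encodes sum(coeffs[i] * x[i]) <= bound.
    Returns the projected system over the remaining variables.
    """
    pos = [r for r in rows if r[0][k] > 0]
    neg = [r for r in rows if r[0][k] < 0]
    zero = [r for r in rows if r[0][k] == 0]

    def drop(coeffs):
        # Remove the column of the eliminated variable.
        return coeffs[:k] + coeffs[k + 1:]

    projected = [(drop(a), b) for a, b in zero]
    for ap, bp in pos:
        for an, bn in neg:
            # Positive multipliers chosen so that the x[k] terms cancel.
            cp, cn = -an[k], ap[k]
            combo = [cp * x + cn * y for x, y in zip(ap, an)]
            projected.append((drop(combo), cp * bp + cn * bn))
    return projected


# Hypothetical toy system in the variables (R, R_a), where a rate R is split
# into partial rates R_a and R_b = R - R_a:
#   R_a <= 1,  R - R_a <= 2,  R_a >= 0,  R - R_a >= 0.
system = [
    ([0, 1], 1),    # R_a <= 1
    ([1, -1], 2),   # R - R_a <= 2
    ([0, -1], 0),   # -R_a <= 0
    ([-1, 1], 0),   # -(R - R_a) <= 0
]
reduced = fm_eliminate(system, 1)  # eliminate the partial rate R_a
# Keep only rows that still constrain R (drop trivial "0 <= c" rows).
nontrivial = [row for row in reduced if any(row[0])]
print(nontrivial)
```

In this toy case the partial rate disappears and only constraints on the total rate remain, which mirrors, on a much smaller scale, how the partial rates are removed from the error-analysis inequalities.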

Based on these general results, we then derive the capacity region for the special case of a vector Gaussian FSM- 
MAC with diagonal channel transfer matrices. This channel model can be used to represent an orthogonal frequency- 
division multiplexing (OFDM)-based communication system, employing single receive and transmit antennas. The 
diagonal entries of the channel matrices represent the orthogonal sub-channels used by the OFDM scheme. 

In order to derive the capacity region for the above channel, we use a multivariate extension of a novel tool first
derived in [36]. Using this tool, we demonstrate that Gaussian multivariate distributions maximize certain mutual
information expressions under a Markovity constraint. The scalar version of this tool was employed by Lapidoth
et al. [37] to provide an outer bound for the capacity region of the scalar Gaussian non-state-dependent MAC with
conferencing encoders. Wigger and Kramer also used this tool in their solution for the capacity region of the
three-user non-state-dependent MIMO MAC with conferencing [38]. Such a tool is needed because
the input distribution of the conferencing channel must admit a certain Markovity constraint. In
cases where no Markov relation is to be satisfied, the traditional approach for proving the optimality of Gaussian
multivariate distributions is by employing either the Vector Max-Entropy Theorem (a direct extension of [39,
Theorem 12.1.1]) or a conditional version of it. Here, however, this approach fails since replacing a non-Gaussian
vector satisfying the Markovity condition by a Gaussian vector of the same covariance matrix may result in a
Gaussian vector that violates the Markovity condition. In order to overcome this issue we use a sufficient and
necessary condition on the (auto- and cross-) covariance matrices of the involved Gaussian random vectors (RVs)
in order for them to admit a Markov relation [40, Section 2, Theorem 1].

We note that although Gaussian input vectors are shown to be optimal in this setting, the original form of the
capacity region involves a non-convex optimization problem. To alleviate this difficulty, new variables are introduced
to convert the optimization problem into a convex one, which can then be solved using numerical tools such as
CVX. The capacity region for the corresponding scalar Gaussian channel can be immediately derived from the
result for the vector channel setting, and serves as an extension of the result in [37] to the state-dependent case.
The capacity region of the vector Gaussian FSM-MAC with common message and the same CSI properties is also
easily derivable from the result for the conferencing channel, by exploiting the strong correspondence between the
two models and using a simple analogy.

We conclude this paper with a specific example, namely a scalar AWGN channel with two possible states ('Good' 
and 'Bad'), in order to gain some insights into the practical implications of the results. Numerical results are included 
to demonstrate the impact of different channel parameters on the capacity region and the optimal input distribution. 
The interactions between the different parameters are interpreted in a manner that produces valuable insights. 

The remainder of the paper is organized as follows. In Section II we describe the two communication models
of interest: the FSM-MAC with common message and delayed CSI, and the FSM-MAC with partially
cooperative encoders and delayed CSI. In Sections III and IV we state the capacity results for the common
message and conferencing models, respectively. Each result is followed by its proof. Next, in Section V the vector
Gaussian FSM-MAC with diagonal channel transfer matrices is defined and the maximization problem defining its
capacity region is derived. The regions for the corresponding common message model and the scalar setting are
given as special cases. The two-state Gaussian example is discussed in this section as well. Finally, Section VI
summarizes the main achievements and insights presented in this work along with some possible future directions
and extensions.

II. CHANNEL MODELS AND NOTATIONS

In this paper, we investigate the capacity region of the FSM-MAC with partially cooperative encoders, full CSI at
the decoder (receiver) and delayed CSI at the encoders (transmitters), as illustrated in Fig. 1. In order to do so, we
first consider a different setting, which is the FSM-MAC with a common message and the same CSI properties, see
Fig. 2. The derivation of the capacity region for the common message setting forms the basis for the achievability
proof for the conferencing case. Since most definitions for both channels follow similar lines, we start by defining
the common message setting and then extend the description for the setting of partially cooperative encoders.

Throughout this work we use the following notations. Scalars are denoted by lower case letters whereas random
variables are denoted by upper case letters. We denote deterministic column vectors by boldface lower case letters,
namely $\mathbf{x} \triangleq (x_1, \ldots, x_N)^T$ (the vector's dimensions will be stated explicitly), while random column vectors are
denoted by boldface upper case letters as $\mathbf{X}$. Matrices are denoted by non-italic upper case letters, that is $\mathrm{X}$, and
finally, we use $X^n$ to denote the sequence $\{X_1, \ldots, X_n\}$.

A. FSM-MAC with Common Message and Delayed CSI 
Fig. 2: FSM-MAC with a common message, full CSI at the decoder and delayed CSI at the encoders with delays
$d_1$ and $d_2$.



We consider the communication system of FSM-MAC with a common message, CSI at the decoder and delayed
CSI at the encoders with delays $d_1$ and $d_2$, as illustrated in Fig. 2. The MAC setting consists of two senders and
one receiver. Each sender $j \in \{1,2\}$ chooses a pair of indices, $(m_0, m_j)$, uniformly from the set
$\{1, \ldots, 2^{nR_0}\} \times \{1, \ldots, 2^{nR_j}\}$, where $m_0$ denotes the common message and $m_j$, $j \in \{1,2\}$, denotes the private
message of the corresponding sender. The choices of $m_0$, $m_1$ and $m_2$ are independent.

The input to the channel from encoder $j \in \{1,2\}$ is denoted by $\{X_{j,1}, X_{j,2}, \ldots, X_{j,n}\}$, and the output of the
channel is denoted by $\{Y_1, Y_2, \ldots, Y_n\}$. Using the notations stated at the beginning of this section we simply
denote the channel's inputs and output by $X_j^n$, $j \in \{1,2\}$, and $Y^n$, respectively.
The FSM channel is assumed to be, at each time instance, in one of a finite number of states $\mathcal{S} = \{s_1, s_2, \ldots, s_k\}$.
In each state, the channel is a discrete memoryless channel (DMC), with input alphabets $\mathcal{X}_1$, $\mathcal{X}_2$ and output alphabet
$\mathcal{Y}$. Let the random variables $S_i$ and $S_{i-d_j}$, $j \in \{1,2\}$, denote the channel state at times $i$ and $i - d_j$, respectively.
Similarly, we denote by $X_{1,i}$, $X_{2,i}$ and $Y_i$ the inputs and the output of the channel at time $i$. The channel transition
probability distribution at time $i$ depends on the state $S_i$ and the inputs $X_{1,i}$, $X_{2,i}$ at time $i$, and is given by
$P(y_i \mid x_{1,i}, x_{2,i}, s_i)$. The channel output at any time $i$ is assumed to depend only on the channel inputs and state at
time $i$. Hence,

$$P(y_i \mid x_1^i, x_2^i, s^i) = P(y_i \mid x_{1,i}, x_{2,i}, s_i). \qquad (1)$$

The state process, $\{S_i\}_{i=1}^{\infty}$, is assumed to be an irreducible, aperiodic, finite-state homogeneous Markov chain and
is therefore ergodic. The state process is independent of the channel inputs and output when conditioned on the
previous states, i.e.,

$$P(s_i \mid s^{i-1}, x_1^{i-1}, x_2^{i-1}, y^{i-1}) = P(s_i \mid s_{i-1}). \qquad (2)$$

Furthermore, we assume that the state process is independent of the messages $M_0$, $M_1$ and $M_2$, i.e.,

$$P(s^n, m_0, m_1, m_2) = \prod_{i=1}^{n} P(s_i \mid s_{i-1}) P(m_0) P(m_1) P(m_2). \qquad (3)$$

Now, let $K$ be the one step state-transition probability matrix of the Markov process and let $\pi$ be its steady state
probability distribution. The joint distribution of $(S_i, S_{i-d})$ is stationary and is given by

$$\pi_d(S_i = s_l, S_{i-d} = s_j) = \pi(s_j) K^d(s_l, s_j), \qquad (4)$$

where $K^d(s_l, s_j)$ is the $(l, j)$-th element of the $d$-step transition probability matrix, $K^d$, of the Markov state
process. Without loss of generality, we assume henceforth that $d_1 \geq d_2$. Furthermore, to simplify the notation,
we define the joint distribution of the variables $(S, S_1, S_2)$ as the joint distribution of the random variables (RVs)
$(S_i, S_{i-d_1}, S_{i-d_2})$, i.e.,

$$P_{S,S_1,S_2}(s_l, s_j, s_v) = \pi(s_j) K^{d_1-d_2}(s_v, s_j) K^{d_2}(s_l, s_v), \qquad (5)$$

where $(s_l, s_j, s_v) \in \mathcal{S}^3$.

A $(n, 2^{nR_0}, 2^{nR_1}, 2^{nR_2}, d_1, d_2)$ code for the FSM-MAC with CSI at the decoder and
delayed CSI at the encoders with delays $d_1$ and $d_2$ consists of:



1) Three sets of integers $\mathcal{M}_0 = \{1, 2, \ldots, 2^{nR_0}\}$, $\mathcal{M}_1 = \{1, 2, \ldots, 2^{nR_1}\}$ and $\mathcal{M}_2 = \{1, 2, \ldots, 2^{nR_2}\}$, referred to
as the message sets.

2) Two encoding functions $f_j$, $j \in \{1,2\}$. Each function $f_j$ is defined by means of a sequence of functions
$f_{j,i}$ (where $i$ denotes the time instance) that depend only on the pair of messages $(M_0, M_j)$, and the channel
states up to time $i - d_j$. The output of Encoder $j$ at time $i$ is given by

$$X_{j,i} = \begin{cases} f_{j,i}(M_0, M_j), & 1 \leq i \leq d_j \\ f_{j,i}(M_0, M_j, S^{i-d_j}), & d_j + 1 \leq i \leq n. \end{cases} \qquad (6)$$



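3) A decoding function:

$$\psi : \mathcal{Y}^n \times \mathcal{S}^n \to \mathcal{M}_0 \times \mathcal{M}_1 \times \mathcal{M}_2. \qquad (7)$$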



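We define the average probability of error for the $(n, 2^{nR_0}, 2^{nR_1}, 2^{nR_2}, d_1, d_2)$ code as

$$P_e^{(n)} = \frac{1}{2^{n(R_0+R_1+R_2)}} \sum_{m_0,m_1,m_2} \sum_{s^n} P_{S^n}(s^n) P\{\psi(y^n, s^n) \neq (m_0, m_1, m_2) \mid (m_0, m_1, m_2) \text{ was sent}\}, \qquad (8)$$

where $P\{A\}$ denotes the probability of the event $A$.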

We use standard definitions [39] of achievability and of the capacity region, namely, a rate triplet $(R_0, R_1, R_2)$
is achievable for the FSM-MAC if there exists a sequence of $(n, 2^{nR_0}, 2^{nR_1}, 2^{nR_2}, d_1, d_2)$ codes with $P_e^{(n)} \to 0$
as $n$ goes to infinity. The capacity region is the closure of the set of achievable rates $(R_0, R_1, R_2)$.
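
The joint state distribution used in this subsection can be evaluated numerically for a small chain. The sketch below is only illustrative: it assumes the convention suggested by (4), namely that the entry `k[l][j]` is the probability of moving to state `l` from state `j` (so each column sums to one), and the two-state matrix and the helper names are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(k, d):
    """d-step transition matrix K^d (identity for d = 0)."""
    n = len(k)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(d):
        result = mat_mul(result, k)
    return result


def stationary(k, iters=1000):
    """Approximate the steady-state distribution by repeated application of K."""
    n = len(k)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(k[l][j] * pi[j] for j in range(n)) for l in range(n)]
    return pi


def joint_delayed(k, d1, d2):
    """Return p[l][j][v] = P(S = l, S_1 = j, S_2 = v), following the form of (5)."""
    assert d1 >= d2 >= 0
    n = len(k)
    pi = stationary(k)
    k_gap, k_d2 = mat_pow(k, d1 - d2), mat_pow(k, d2)
    return [[[pi[j] * k_gap[v][j] * k_d2[l][v] for v in range(n)]
             for j in range(n)] for l in range(n)]


# Hypothetical two-state chain; columns sum to one under the assumed convention.
K = [[0.9, 0.2],
     [0.1, 0.8]]
P = joint_delayed(K, d1=3, d2=1)
print(sum(P[l][j][v] for l in range(2) for j in range(2) for v in range(2)))
```

Summing the result over any two of the three indices should recover the steady-state distribution, which is a quick sanity check on the stationarity claim.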

B. FSM-MAC with Partially Cooperative Encoders and Delayed CSI 

We now define the FSM-MAC with CSI at the decoder, delayed CSI at the encoders with delays $d_1$ and $d_2$, and
partially cooperative encoders with cooperation (or conferencing) links of capacities $C_{12}$ and $C_{21}$, as illustrated in
Fig. 1.

The channel definition relies on Subsection II-A while taking the common message set to be $\mathcal{M}_0 = \emptyset$.
Here, however, conferencing between the encoders is introduced. The conference is assumed to take place prior
to the transmission of a codeword through the channel and consists of $\ell$ consecutive pairs of communications,
simultaneously transmitted by the encoders. Each communication depends on the message to be transmitted by the
sending encoder and previously received communications from the other encoder. We denote the communications
transmitted from encoder $j \in \{1,2\}$ to the other encoder by $V_j$. Note that here the state process is also assumed
to be independent of the conference communications, i.e.,

$$P(s^n, m_1, m_2, v_1, v_2) = \prod_{i=1}^{n} P(s_i \mid s_{i-1}) P(m_1, m_2, v_1, v_2). \qquad (9)$$

A $(n, \ell, 2^{nR_1}, 2^{nR_2}, d_1, d_2)$ code for the FSM-MAC with CSI at the decoder, delayed CSI at the encoders with
delays $d_1$ and $d_2$, and conferencing links with capacities $C_{12}$ and $C_{21}$ consists of:

1) Two sets of integers $\mathcal{M}_1 = \{1, 2, \ldots, 2^{nR_1}\}$ and $\mathcal{M}_2 = \{1, 2, \ldots, 2^{nR_2}\}$, referred to as the message sets.

2) Two encoders, where each encoder is completely described by an encoding function, $f_j$, and a set of $\ell$ ($\ell \geq 1$)
communication functions, $\{h_{j,1}, h_{j,2}, \ldots, h_{j,\ell}\}$, $j \in \{1,2\}$ (similar definitions were also used in [2]).

3) The encoding function, $f_j$, maps the message $M_j$, $j \in \{1,2\}$, and what was learned from the conference with
the other encoder into channel codewords of length $n$. Each function $f_j$ is defined by means of a sequence
of functions $f_{j,i}$ that depend only on the message $M_j$, the received communications from the other encoder
in the conferencing stage and the channel states up to time $i - d_j$. We emphasize that since the encoding
occurs only after the conferencing has finished, each $f_{j,i}$ depends on all received communications.

4) Each communication function $h_{1,i}$ (respectively, $h_{2,i}$), $i \in \{1, 2, \ldots, \ell\}$, maps the message $M_1$ (respectively,
$M_2$) and the sequence of previously received communications from the other encoder $V_2^{i-1}$ (respectively,
$V_1^{i-1}$), onto the $i$-th communication $V_{1,i}$ (respectively, $V_{2,i}$). More specifically, the communications are defined
as:

$$V_{1,i} = h_{1,i}(M_1, V_2^{i-1}) \; ; \quad V_{2,i} = h_{2,i}(M_2, V_1^{i-1}). \qquad (10)$$

5) The encoding function for Encoder 1 satisfies

$$X_{1,i} = \begin{cases} f_{1,i}(M_1, V_2), & 1 \leq i \leq d_1 \\ f_{1,i}(M_1, V_2, S^{i-d_1}), & d_1 + 1 \leq i \leq n, \end{cases} \qquad (11)$$

and the encoding function for Encoder 2 is defined in an analogous manner (using the private message $M_2$,
the communications $V_1$ and the delay $d_2$).

6) The random variable $V_{j,i}$, for $j \in \{1,2\}$ and $i \in \{1, 2, \ldots, \ell\}$, ranges over the finite alphabet $\mathcal{V}_{j,i}$. In the case
of partially cooperating encoders, the amount of information exchanged during the conference is bounded by
the finite communication link capacities $C_{12}$ and $C_{21}$. A conference is $(C_{12}, C_{21})$-permissible if the sets of
communication functions are such that

$$\sum_{i=1}^{\ell} \log|\mathcal{V}_{1,i}| \leq nC_{12} \; ; \quad \sum_{i=1}^{\ell} \log|\mathcal{V}_{2,i}| \leq nC_{21}. \qquad (12)$$

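7) A decoding function:

$$\psi : \mathcal{Y}^n \times \mathcal{S}^n \to \mathcal{M}_1 \times \mathcal{M}_2. \qquad (13)$$

The average probability of error for the $(n, \ell, 2^{nR_1}, 2^{nR_2}, d_1, d_2)$ code is

$$P_e^{(n)} = \frac{1}{2^{n(R_1+R_2)}} \sum_{m_1,m_2} \sum_{s^n} P_{S^n}(s^n) P\{\psi(y^n, s^n) \neq (m_1, m_2) \mid (m_1, m_2) \text{ was sent}\}. \qquad (14)$$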

The achievable rates and the capacity region are defined in an analogous manner to Section II-A.
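
The permissibility condition on the conference can be checked mechanically once the alphabet sizes of the individual communications are fixed. The following is a minimal sketch under two assumptions that are not stated in the text: logarithms are taken to base 2 (consistent with message sets of size $2^{nR}$), and the function name and example numbers are hypothetical.

```python
import math


def is_permissible(alphabet_sizes_1, alphabet_sizes_2, n, c12, c21):
    """Check the two sum constraints of a (C12, C21)-permissible conference.

    alphabet_sizes_j[i] is the alphabet size of the (i+1)-th communication
    sent by encoder j; n is the block length.
    """
    bits_1 = sum(math.log2(size) for size in alphabet_sizes_1)
    bits_2 = sum(math.log2(size) for size in alphabet_sizes_2)
    return bits_1 <= n * c12 and bits_2 <= n * c21


# Hypothetical conference with two rounds from Encoder 1 and one from Encoder 2.
print(is_permissible([4, 2], [2], n=10, c12=0.5, c21=0.25))
```

Note that only the total number of exchanged bits per block matters here, not how it is distributed over the conference rounds.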

III. THE CAPACITY REGION OF THE FSM-MAC WITH A COMMON MESSAGE AND DELAYED TRANSMITTER CSI

In this section we state the capacity region of the FSM-MAC with common message and delayed transmitter
CSI, followed by its proof. Without loss of generality, we assume that $d_1 \geq d_2$.
Theorem 1: The capacity region of the FSM-MAC with a common message, CSI at the decoder and asymmetrically
delayed CSI at the encoders with delays $d_1$ and $d_2$, as defined in Section II-A and shown in Fig. 2, is given by:

$$\mathcal{C}_{CM} = \bigcup_{P(u|s_1)P(x_1|s_1,u)P(x_2|s_1,s_2,u)} \left\{ \begin{array}{l} R_1 \leq I(X_1;Y \mid X_2,U,S,S_1,S_2), \\ R_2 \leq I(X_2;Y \mid X_1,U,S,S_1,S_2), \\ R_1+R_2 \leq I(X_1,X_2;Y \mid U,S,S_1,S_2), \\ R_0+R_1+R_2 \leq I(X_1,X_2;Y \mid S,S_1,S_2) \end{array} \right\}, \qquad (15)$$

where $U$ is a RV with bounded cardinality and where the joint distribution of $(S, S_1, S_2)$ is specified in (5).
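
For a single fixed input distribution, the four inequalities of Theorem 1 define a simple polytope, and membership of a rate triplet can be tested directly. The sketch below assumes the four mutual information values have already been computed elsewhere; the function name, the argument names and the numbers are hypothetical and serve only to show how the constraints interact.

```python
def in_common_message_region(r0, r1, r2, i1, i2, i12, i012):
    """Check a rate triplet against the four bounds for one input distribution.

    i1   stands for I(X1;Y|X2,U,S,S1,S2)
    i2   stands for I(X2;Y|X1,U,S,S1,S2)
    i12  stands for I(X1,X2;Y|U,S,S1,S2)
    i012 stands for I(X1,X2;Y|S,S1,S2)
    """
    return (min(r0, r1, r2) >= 0
            and r1 <= i1
            and r2 <= i2
            and r1 + r2 <= i12
            and r0 + r1 + r2 <= i012)


# Hypothetical mutual information values (in bits per channel use).
bounds = dict(i1=1.0, i2=0.75, i12=1.5, i012=2.0)
print(in_common_message_region(0.5, 0.75, 0.75, **bounds))
```

The common rate enters only through the last inequality, so any slack left by the private rates in the total bound can be given to the common message.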



A. Converse 

Given an achievable rate triplet $(R_0, R_1, R_2)$ we need to show that there exists a joint distribution of the form
$P(s, s_1, s_2)P(u|s_1)P(x_1|s_1,u)P(x_2|s_1,s_2,u)P(y|x_1,x_2,s)$ such that the inequalities in (15) are satisfied. Since
$(R_0, R_1, R_2)$ is an achievable rate triplet, there exists an $(n, 2^{nR_0}, 2^{nR_1}, 2^{nR_2}, d_1, d_2)$ code with a probability of
error $P_e^{(n)}$ that becomes arbitrarily small with the increase of the block length (see (8)). By Fano's inequality,

$$H(M_0, M_1, M_2 \mid Y^n, S^n) \leq n(R_0 + R_1 + R_2)P_e^{(n)} + H(P_e^{(n)}) \triangleq n\epsilon_n, \qquad (16)$$

where clearly $\epsilon_n \to 0$ as $P_e^{(n)} \to 0$. It hence follows that

$$H(M_1 \mid Y^n, S^n) \leq H(M_0, M_1, M_2 \mid Y^n, S^n) \leq n\epsilon_n, \qquad (17)$$
$$H(M_2 \mid Y^n, S^n) \leq H(M_0, M_1, M_2 \mid Y^n, S^n) \leq n\epsilon_n, \qquad (18)$$
$$H(M_1, M_2 \mid Y^n, S^n) \leq H(M_0, M_1, M_2 \mid Y^n, S^n) \leq n\epsilon_n. \qquad (19)$$


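For the sake of brevity we focus here on the upper bound on $R_1$, while noting that all other upper bounds in
(15) can be derived in a completely analogous manner, using the same auxiliary RV definition. It now follows that

$$\begin{aligned}
nR_1 &= H(M_1) \\
&= H(M_1) + H(M_1 \mid Y^n, S^n) - H(M_1 \mid Y^n, S^n) \\
&\overset{(a)}{\leq} I(M_1; Y^n, S^n) + n\epsilon_n \\
&\overset{(b)}{=} I(M_1; Y^n \mid S^n) + n\epsilon_n \\
&\overset{(c)}{\leq} H(M_1 \mid S^n, M_0, M_2) - H(M_1 \mid Y^n, S^n, M_0, M_2) + n\epsilon_n \\
&\overset{(d)}{=} \sum_{i=1}^{n} I(M_1; Y_i \mid S^n, M_0, M_2, Y^{i-1}) + n\epsilon_n \\
&\overset{(e)}{=} \sum_{i=1}^{n} \left[ H(Y_i \mid S^n, X_2^n, M_0, M_2, Y^{i-1}) - H(Y_i \mid S^n, X_1^n, X_2^n, M_0, M_1, M_2, Y^{i-1}) \right] + n\epsilon_n \\
&\overset{(f)}{\leq} \sum_{i=1}^{n} \left[ H(Y_i \mid X_{2,i}, S_i, S_{i-d_1}, S_{i-d_2}, M_0, S^{i-d_1-1}) - H(Y_i \mid S^n, X_1^n, X_2^n, M_0, M_1, M_2, Y^{i-1}) \right] + n\epsilon_n \\
&\overset{(g)}{=} \sum_{i=1}^{n} \left[ H(Y_i \mid X_{2,i}, S_i, S_{i-d_1}, S_{i-d_2}, M_0, S^{i-d_1-1}) - H(Y_i \mid X_{1,i}, X_{2,i}, S_i, S_{i-d_1}, S_{i-d_2}, M_0, S^{i-d_1-1}) \right] + n\epsilon_n \\
&\overset{(h)}{=} \sum_{i=1}^{n} I(X_{1,i}; Y_i \mid X_{2,i}, S_i, S_{i-d_1}, S_{i-d_2}, U_i) + n\epsilon_n,
\end{aligned}$$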

where:

(a) follows from (17);

(b) follows from the fact that $M_1$ and $S^n$ are independent;

(c) follows from the fact that $M_1$ and $(M_0, M_2)$ are independent given $S^n$ (first term), and the fact that conditioning
reduces entropy (second term);

(d) follows from the mutual information chain rule;

(e) follows from the fact that $X_1^n$ is a deterministic function of $(M_0, M_1, S^n)$ and $X_2^n$ is a deterministic
function of $(M_0, M_2, S^n)$;

(f) follows from the fact that conditioning reduces entropy;

(g) follows from the fact that the channel output at time $i$ depends only on the state $S_i$ and the inputs $X_{1,i}$
and $X_{2,i}$;

(h) follows by defining $U_i \triangleq (M_0, S^{i-d_1-1})$.

Note that the definition of the auxiliary RV $U_i$ represents the common message and the common knowledge of the
state sequence at time $i$ (except for $S_{i-d_1}$), which, in fact, encompasses all common information shared by both
encoders at this time instance. We can hence conclude that the rate $R_1$ must satisfy the following upper bound:

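$$R_1 \leq \frac{1}{n} \sum_{i=1}^{n} I(X_{1,i}; Y_i \mid X_{2,i}, S_i, S_{i-d_1}, S_{i-d_2}, U_i) + \epsilon_n. \qquad (20)$$

In a completely analogous manner it can be shown that

$$R_2 \leq \frac{1}{n} \sum_{i=1}^{n} I(X_{2,i}; Y_i \mid X_{1,i}, S_i, S_{i-d_1}, S_{i-d_2}, U_i) + \epsilon_n, \qquad (21)$$

$$R_1 + R_2 \leq \frac{1}{n} \sum_{i=1}^{n} I(X_{1,i}, X_{2,i}; Y_i \mid S_i, S_{i-d_1}, S_{i-d_2}, U_i) + \epsilon_n, \qquad (22)$$

$$R_0 + R_1 + R_2 \leq \frac{1}{n} \sum_{i=1}^{n} I(X_{1,i}, X_{2,i}; Y_i \mid S_i, S_{i-d_1}, S_{i-d_2}) + \epsilon_n. \qquad (23)$$
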


The upper bounds in (20)-(23) can also be rewritten by introducing a new time sharing RV $Q$, that is uniformly
distributed over the set $\{1, 2, \ldots, n\}$. For example, the upper bound in (20) can be rewritten as

$$\begin{aligned}
R_1 &\leq \frac{1}{n} \sum_{i=1}^{n} I(X_{1,Q}; Y_Q \mid X_{2,Q}, S_Q, S_{Q-d_2}, S_{Q-d_1}, U_Q, Q = i) + \epsilon_n \\
&= I(X_{1,Q}; Y_Q \mid X_{2,Q}, S_Q, S_{Q-d_2}, S_{Q-d_1}, U_Q, Q) + \epsilon_n. \qquad (24)
\end{aligned}$$
Repeating similar steps for all other upper bounds and denoting $X_1 = X_{1,Q}$, $X_2 = X_{2,Q}$, $Y = Y_Q$, $S = S_Q$,
$S_1 = S_{Q-d_1}$, $S_2 = S_{Q-d_2}$ and $U = (U_Q, Q)$, we get:

$$\begin{aligned}
R_1 &\leq I(X_1; Y \mid X_2, U, S, S_1, S_2) + \epsilon_n, \\
R_2 &\leq I(X_2; Y \mid X_1, U, S, S_1, S_2) + \epsilon_n, \\
R_1 + R_2 &\leq I(X_1, X_2; Y \mid U, S, S_1, S_2) + \epsilon_n, \\
R_0 + R_1 + R_2 &\leq I(X_1, X_2; Y \mid S, S_1, S_2) + \epsilon_n.
\end{aligned}$$

Taking the limit as $n \to \infty$, one obtains the bounds as in (15).



To complete the proof of the converse it is left to show that the following Markov relations hold:

$$P(u \mid s, s_1, s_2) = P(u \mid s_1), \qquad (25)$$
$$P(x_1 \mid s, s_1, s_2, u) = P(x_1 \mid s_1, u), \qquad (26)$$
$$P(x_2 \mid x_1, s, s_1, s_2, u) = P(x_2 \mid s_1, s_2, u). \qquad (27)$$

The proof of these relations is provided in Appendix A.

B. Achievability Proof 

The proof of the achievability part relies on rate splitting, multiplexing coding and joint decoding. Fourier-Motzkin
elimination is employed to reduce the number of inequalities induced by the error probability analysis.
To establish achievability, we need to show that for a fixed distribution $P(u|s_1)P(x_1|u,s_1)P(x_2|u,s_1,s_2)$ and rates
$(R_0, R_1, R_2)$ that satisfy the inequalities in (15), there exists a sequence of $(n, 2^{nR_0}, 2^{nR_1}, 2^{nR_2}, d_1, d_2)$ codes such
that $P_e^{(n)} \to 0$ as $n \to \infty$. Without loss of generality, we assume that the finite-state space is the set $\mathcal{S} = \{1, 2, \ldots, k\}$
and that the steady state probability satisfies $\pi(l) > 0$ for all $l \in \mathcal{S}$. Consider now the following encoding and
decoding scheme.

1) Encoder 1: Encoder 1 constructs $k$ codebooks $\mathcal{C}_0^{s_1}$ (the subscript designates the message $m_0$) for all $s_1 \in \mathcal{S}$.
Each codebook $\mathcal{C}_0^{s_1}$ comprises $2^{n_1(s_1)R_0(s_1)}$ codewords, where $n_1(s_1) = (P(S_1 = s_1) - \epsilon') \cdot n$ for some $\epsilon' > 0$.
Each codeword in the codebook $\mathcal{C}_0^{s_1}$ is composed of $n_1(s_1)$ independent realizations of the random variable $U^{s_1}$,
distributed identically according to $P(u^{s_1} \mid S_1 = s_1)$.

Then, for every codeword $\mathcal{C}_0^{s_1}(i)$ in the codebook $\mathcal{C}_0^{s_1}$, $i \in \{1, 2, \ldots, 2^{n_1(s_1)R_0(s_1)}\}$, Encoder 1 generates a
codebook $\mathcal{C}_{1,i}^{s_1}$, comprising $2^{n_1(s_1)R_1(s_1)}$ codewords. Each codeword in the codebook $\mathcal{C}_{1,i}^{s_1}$ is composed of $n_1(s_1)$
realizations of the random variable $X_1^{s_1}$, which are i.i.d. according to $P(x_1^{s_1} \mid u^{s_1}, S_1 = s_1)$. A message pair $(M_0, M_1)$
is chosen according to a uniform distribution $P(M_0 = m_0, M_1 = m_1) = 2^{-n(R_0+R_1)}$, where $m_0 \in \{1, 2, \ldots, 2^{nR_0}\}$
and $m_1 \in \{1, 2, \ldots, 2^{nR_1}\}$. Every message $m_0$ is split into $k$ sub-messages $m_0 = \{m_{0,1}, m_{0,2}, \ldots, m_{0,k}\}$
(corresponding to each of the possible states), where the length of the sub-message $m_{0,s_1}$ is $n_1(s_1)R_0(s_1)$.
Hence, every message $m_0$ is specified by a $k$-dimensional vector. Similarly, every message $m_1$ is split into $k$
sub-messages $m_1 = \{m_{1,1}, m_{1,2}, \ldots, m_{1,k}\}$ and is thus also specified by a $k$-dimensional vector. The length of the
sub-message $m_{1,s_1}$ is $n_1(s_1)R_1(s_1)$.

For a fixed block length n, let Ng-^ < n be the number of times at which the feedback information at Encoder 
1 regarding the channel state is = si. Whenever the delayed CSI is Si — si. Encoder 1 sends the next symbol 
from the codebook C^\, where i is specified by the value of the sub-message of niQ that corresponds to the state si. 
The construction of the codebooks for the sub-common and sub-private messages corresponding to some delayed 
CSI si = ^ is illustrated in Fig. [3] Encoder 1 repeats this construction for every £ E S. 



Fig. 3: Multiplexing coding with common message. Encoder 1's codebook is assembled from the common and private message codebooks. A common message codebook, $\mathcal{C}_0^{\tilde{s}_1=\ell}$, is generated for every sub-common message $m_{0,\ell}$, $\ell \in \mathcal{S}$, each containing $2^{n_1(\tilde{s}_1=\ell)R_0(\tilde{s}_1=\ell)}$ codewords. For every codeword in the $\ell$-th common message codebook a private message codebook is generated, each containing $2^{n_1(\tilde{s}_1=\ell)R_1(\tilde{s}_1=\ell)}$ codewords.



2) Encoder 2: Encoder 2 first constructs $k$ codebooks $\mathcal{C}_0^{\tilde{s}_1}$, one for each $\tilde{s}_1 \in \mathcal{S}$, in an analogous manner to Encoder 1. Then, for every codeword $\mathcal{C}_0^{\tilde{s}_1}(i)$ in the codebook $\mathcal{C}_0^{\tilde{s}_1}$, $i \in \{1, 2, \ldots, 2^{n_1(\tilde{s}_1)R_0(\tilde{s}_1)}\}$, Encoder 2 generates $k$ codebooks $\mathcal{C}_{2,i}^{\tilde{s}_1,\tilde{s}_2}$, one for every $\tilde{s}_2 \in \mathcal{S}$. Each codebook comprises $2^{n_2(\tilde{s}_1,\tilde{s}_2)R_2(\tilde{s}_1,\tilde{s}_2)}$ codewords, where $n_2(\tilde{s}_1,\tilde{s}_2) = (P(\tilde{S}_1 = \tilde{s}_1, \tilde{S}_2 = \tilde{s}_2) - \epsilon'') \cdot n$ for some $\epsilon'' > 0$. Each codeword in the codebook $\mathcal{C}_{2,i}^{\tilde{s}_1,\tilde{s}_2}$ is composed of $n_2(\tilde{s}_1,\tilde{s}_2)$ realizations of the random variable $X_2^{\tilde{s}_1,\tilde{s}_2}$, which are i.i.d. according to $P(x_2^{\tilde{s}_1,\tilde{s}_2}|u^{\tilde{s}_1}, \tilde{S}_1 = \tilde{s}_1, \tilde{S}_2 = \tilde{s}_2)$. The mapping of the message $m_0$ is similar to the one defined for Encoder 1, while the mapping for $m_2$ is as follows. A message $m_2 \in \{1, 2, \ldots, 2^{nR_2}\}$ is chosen according to a uniform distribution $P(M_2 = m_2) = 2^{-nR_2}$. Every message $m_2$ is split into $k \times k$ sub-messages $m_2 = \{m_{2,1,1}, m_{2,1,2}, \ldots, m_{2,k,k}\}$ (one for each of the possible pairs of states). Each of these sub-messages is of length $n_2(\tilde{s}_1,\tilde{s}_2)R_2(\tilde{s}_1,\tilde{s}_2)$. Hence, every message $m_2$ is specified by a $k^2$-dimensional vector.

For a fixed block length $n$, let $N_{\tilde{s}_1,\tilde{s}_2}$ be the number of times for which the feedback information at Encoder 2 regarding the channel state is $(\tilde{S}_1,\tilde{S}_2) = (\tilde{s}_1,\tilde{s}_2)$. Whenever the delayed CSI pair is $(\tilde{S}_1,\tilde{S}_2) = (\tilde{s}_1,\tilde{s}_2)$, Encoder 2 sends the next symbol from the codebook $\mathcal{C}_{2,i}^{\tilde{s}_1,\tilde{s}_2}$, where $i$ is specified in a similar manner to its counterpart in Encoder 1.

3) Decoding process: The proposed decoding rule aims to achieve capacity by simultaneous decoding. Since full instantaneous CSI is assumed at the receiver, the information about $\tilde{S}_1$ and $(\tilde{S}_1,\tilde{S}_2)$ used at the encoding stage is also available at the decoder.

First, we note that $N_{\tilde{s}_1}$ (respectively, $N_{\tilde{s}_1,\tilde{s}_2}$) is not necessarily equal to $n_1(\tilde{s}_1)$ (respectively, $n_2(\tilde{s}_1,\tilde{s}_2)$). Therefore, the decoder declares an error if $N_{\tilde{s}_1} < n_1(\tilde{s}_1)$ (respectively, $N_{\tilde{s}_1,\tilde{s}_2} < n_2(\tilde{s}_1,\tilde{s}_2)$), while the code is zero-filled if $N_{\tilde{s}_1} > n_1(\tilde{s}_1)$ (respectively, $N_{\tilde{s}_1,\tilde{s}_2} > n_2(\tilde{s}_1,\tilde{s}_2)$).
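The multiplexing rule used by the encoders, together with the error and zero-fill convention just described, can be sketched as follows. This is only an illustration under stated assumptions: the function names `multiplex` and `demultiplex`, the dictionary layout of the component codewords, and the use of `0` as the fill symbol are choices made here and do not come from the paper.

```python
from collections import defaultdict


def multiplex(codewords, delayed_states):
    """Interleave per-state component codewords along a delayed-state sequence.

    codewords: dict mapping a delayed state to its component codeword (a list).
    delayed_states: the delayed state observed at each channel use.
    Returns (sent, short), where `sent` is the transmitted sequence and
    `short` is the set of states observed fewer times than the length of
    their component codeword (the event in which an error is declared).
    """
    position = defaultdict(int)  # next unread symbol index per state
    sent = []
    for state in delayed_states:
        word = codewords[state]
        idx = position[state]
        # Zero-fill once the component codeword of this state is exhausted.
        sent.append(word[idx] if idx < len(word) else 0)
        position[state] += 1
    short = {s for s, w in codewords.items() if position[s] < len(w)}
    return sent, short


def demultiplex(received, delayed_states):
    """Group received symbols by the delayed state at which they were sent."""
    groups = defaultdict(list)
    for symbol, state in zip(received, delayed_states):
        groups[state].append(symbol)
    return dict(groups)
```

For example, with component codewords of lengths 2 and 3 for states 1 and 2 and the delayed-state sequence `[1, 2, 2, 1, 1]`, state 1 is observed once more than needed (so one fill symbol is sent) and state 2 once fewer than needed (so it is reported in `short`).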

The decoding process is performed in blocks of size $n_1(\tilde{s}_1)$ corresponding to the delayed CSI $\tilde{S}_1$. Upon receiving a block of channel outputs and states, the decoder first demultiplexes it into outputs corresponding to the component codebooks of Encoder 1. The demultiplexing is done using the delayed CSI $\tilde{S}_1$, which is known at the decoder. Then, the decoder simultaneously searches for $k+2$ unique sub-messages $(\hat{m}_{0,\tilde{s}_1}, \hat{m}_{1,\tilde{s}_1}, \hat{\mathbf{m}}_2(\tilde{s}_1))$ that satisfy a certain typicality constraint to be specified next. Here $\hat{\mathbf{m}}_2(\tilde{s}_1)$ is the $k$-dimensional vector of sub-messages of $m_2$ for which $\tilde{S}_1 = \tilde{s}_1$ is fixed. The typicality constraint to be satisfied is:

(28) 

for a given $\tilde{s}_1$. By $\tilde{S}_2^{n_1(\tilde{s}_1)}$ we refer to the sequence of $n_1(\tilde{s}_1)$ delayed channel states $\tilde{S}_2$. Moreover, $U^{n_1(\tilde{s}_1)}(\hat{m}_{0,\tilde{s}_1})$ represents the component codeword of the common sub-message $\hat{m}_{0,\tilde{s}_1}$, while $X_1^{n_1(\tilde{s}_1)}(\hat{m}_{0,\tilde{s}_1}, \hat{m}_{1,\tilde{s}_1})$ is the component codeword of the sub-message pair $(\hat{m}_{0,\tilde{s}_1}, \hat{m}_{1,\tilde{s}_1})$. Finally, $X_2^{n_1(\tilde{s}_1)}(\hat{m}_{0,\tilde{s}_1}, \hat{\mathbf{m}}_2(\tilde{s}_1))$ is constructed by de-multiplexing the component codewords $X_2^{n_2(\tilde{s}_1,\tilde{s}_2)}$ using a common $\tilde{S}_1 = \tilde{s}_1$.

As mentioned earlier, the decoding process is done in blocks of size $n_1(\tilde{s}_1)$, for each $\tilde{s}_1 \in \mathcal{S}$. Thus, whenever the decoder succeeds in decoding each set of $k+2$ sub-messages (which consist of the $k$ sub-messages in $\mathbf{m}_2(\tilde{s}_1)$ as well as the sub-messages $m_{0,\tilde{s}_1}$ and $m_{1,\tilde{s}_1}$), it is immediately clear which triplet of messages $(m_0, m_1, m_2)$ was sent, since each of the messages is uniquely determined by its sub-messages.

By error probability analysis we show that the probability of error, conditioned on a particular codeword being sent, goes to zero if the following conditions are met:



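$$R_1' < I(X_1;Y|X_2,U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2), \tag{29}$$

$$\sum_{\ell \in \mathcal{S}_F} P_\ell R_{2\ell} < \sum_{\ell \in \mathcal{S}_F} P_\ell\, I(X_2;Y|X_1,U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2 = \ell), \tag{30}$$

$$R_1' + \sum_{\ell \in \mathcal{S}_F} P_\ell R_{2\ell} < I(X_1;Y|X_2,U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2) + \sum_{\ell \in \mathcal{S}_F} P_\ell\, I(X_2;Y|U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2 = \ell), \tag{31}$$

$$R_0' + R_1' + R_2' < I(X_1,X_2;Y|S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2), \tag{32}$$

for every possible subset $\mathcal{S}_F \subseteq \mathcal{S}$. The subset $\mathcal{S}_F$ consists of all the states $\tilde{s}_2 \in \mathcal{S}$ for which a decoding error occurred and the reconstructed sub-message is incorrect (the subscript $F$ stands for 'False'). Moreover, (29)-(32) are formulated using the following notations: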

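$$R_0(\tilde{s}_1) = R_0', \tag{33}$$

$$R_1(\tilde{s}_1) = R_1', \tag{34}$$

$$R_2(\tilde{s}_1,\tilde{s}_2 = \ell) = R_{2\ell}, \tag{35}$$

$$P(\tilde{S}_2 = \ell|\tilde{S}_1 = \tilde{s}_1) = P_\ell. \tag{36}$$

We also define

$$R_2(\tilde{s}_1) = \sum_{\ell=1}^{k} P_\ell R_{2\ell}, \tag{37}$$

and denote

$$R_2(\tilde{s}_1) = R_2'. \tag{38}$$

For the full details and notation see Appendix B.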

Next, using Fourier-Motzkin elimination (FME), Appendix C shows that the set of inequalities of the form (30) and (31), which must hold for every $\mathcal{S}_F$, is equivalent to

$$R_2' < I(X_2;Y|X_1,U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2), \tag{39}$$

$$R_1' + R_2' < I(X_1,X_2;Y|U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2). \tag{40}$$

Combining (29) and (32) with (39) and (40), in order for $P_e \to 0$ as $n \to \infty$, the following conditions must hold:

$$R_1(\tilde{s}_1) < I(X_1;Y|X_2,U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2), \tag{41}$$

$$R_2(\tilde{s}_1) < I(X_2;Y|X_1,U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2), \tag{42}$$

$$R_1(\tilde{s}_1) + R_2(\tilde{s}_1) < I(X_1,X_2;Y|U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2), \tag{43}$$

$$R_0(\tilde{s}_1) + R_1(\tilde{s}_1) + R_2(\tilde{s}_1) < I(X_1,X_2;Y|S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2). \tag{44}$$



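Considering the inequality in (41), the result can be extended to address all codebooks $\mathcal{C}_{1,i}^{\tilde{s}_1}$, for every $\tilde{s}_1 \in \mathcal{S}$, in the following way. We define

$$R_1 = \sum_{\tilde{s}_1 \in \mathcal{S}} \frac{n_1(\tilde{s}_1)}{n} R_1(\tilde{s}_1). \tag{45}$$

Using (45) we have

$$\begin{aligned}
R_1 &< \sum_{\tilde{s}_1 \in \mathcal{S}} \frac{n_1(\tilde{s}_1)}{n}\, I(X_1;Y|X_2,U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2) \\
&= \sum_{\tilde{s}_1 \in \mathcal{S}} \left(P(\tilde{s}_1) - \epsilon'\right) I(X_1;Y|X_2,U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2) \\
&= I(X_1;Y|X_2,U,S,\tilde{S}_1,\tilde{S}_2) - \epsilon'',
\end{aligned} \tag{46}$$

where $\epsilon'' = \epsilon' \cdot \sum_{\tilde{s}_1 \in \mathcal{S}} I(X_1;Y|X_2,U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2)$.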

Performing a similar procedure for each of the bounds in (42)-(44), we get that in order to have $P_e \to 0$ as $n \to \infty$, the following rate constraints must be satisfied:

$$R_1 < I(X_1;Y|X_2,U,S,\tilde{S}_1,\tilde{S}_2), \tag{47}$$

$$R_2 < I(X_2;Y|X_1,U,S,\tilde{S}_1,\tilde{S}_2), \tag{48}$$

$$R_1 + R_2 < I(X_1,X_2;Y|U,S,\tilde{S}_1,\tilde{S}_2), \tag{49}$$

$$R_0 + R_1 + R_2 < I(X_1,X_2;Y|S,\tilde{S}_1,\tilde{S}_2), \tag{50}$$

which hold by assumption.

In conclusion, we have shown that if a rate triplet $(R_0, R_1, R_2)$ is inside the rate region given in Theorem 1, then there exists a sequence of $(n, 2^{nR_0}, 2^{nR_1}, 2^{nR_2}, d_1, d_2)$ codes such that $P_e^{(n)} \to 0$ as $n \to \infty$. This completes the proof of the achievability part.



IV. THE CAPACITY REGION OF THE FSM-MAC WITH PARTIALLY COOPERATIVE ENCODERS 

AND DELAYED TRANSMITTER CSI 

In this section we state the capacity region of the FSM-MAC with partially cooperative encoders and delayed transmitter CSI, followed by its proof. Without loss of generality, here we also assume that $d_1 \ge d_2$.



Theorem 2 The capacity region of the FSM-MAC with partially cooperative encoders, cooperation link capacities $C_{12}$ and $C_{21}$, CSI at the decoder and asymmetrically delayed CSI at the encoders with delays $d_1$ and $d_2$, as shown in Fig. [7], is given by:

$$\mathcal{C}_{PC} = \bigcup_{P(u|\tilde{s}_1)P(x_1|\tilde{s}_1,u)P(x_2|\tilde{s}_1,\tilde{s}_2,u)} \left\{ \begin{aligned}
R_1 &\le I(X_1;Y|X_2,U,S,\tilde{S}_1,\tilde{S}_2) + C_{12}, \\
R_2 &\le I(X_2;Y|X_1,U,S,\tilde{S}_1,\tilde{S}_2) + C_{21}, \\
R_1 + R_2 &\le I(X_1,X_2;Y|U,S,\tilde{S}_1,\tilde{S}_2) + C_{12} + C_{21}, \\
R_1 + R_2 &\le I(X_1,X_2;Y|S,\tilde{S}_1,\tilde{S}_2)
\end{aligned} \right\}, \tag{51}$$

where $U$ is an RV with bounded cardinality and where the joint distribution of $(S,\tilde{S}_1,\tilde{S}_2)$ is specified in Q.



A. Converse 

Given an achievable rate pair $(R_1,R_2)$, we need to show that there exists a joint distribution of the form $P(s,\tilde{s}_1,\tilde{s}_2)P(u|\tilde{s}_1)P(x_1|\tilde{s}_1,u)P(x_2|\tilde{s}_1,\tilde{s}_2,u)P(y|x_1,x_2,s)$ such that the inequalities in (51) are satisfied. Since $(R_1,R_2)$ is an achievable rate pair, there exists an $(n, l, 2^{nR_1}, 2^{nR_2}, d_1, d_2)$ code with an arbitrarily small error probability $P_e^{(n)}$. By Fano's inequality,

$$H(M_1,M_2|Y^n,S^n) \le H(P_e^{(n)}) + n(R_1+R_2)P_e^{(n)} \triangleq n\epsilon_n, \tag{52}$$

(with some abuse of notation) where $\epsilon_n \to 0$ as $P_e^{(n)} \to 0$. It hence follows that

$$H(M_1|Y^n,S^n) \le H(M_1,M_2|Y^n,S^n) \le n\epsilon_n, \tag{53}$$

$$H(M_2|Y^n,S^n) \le H(M_1,M_2|Y^n,S^n) \le n\epsilon_n. \tag{54}$$

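As in the proof of Theorem 1, we focus on the upper bound on $R_1$ and note that the upper bounds on all other rates can be obtained in an analogous manner. For $R_1$ we have the following:

$$\begin{aligned}
nR_1 &= H(M_1) \\
&= H(M_1) + H(M_1|Y^n,S^n) - H(M_1|Y^n,S^n) \\
&\overset{(a)}{\le} I(M_1;Y^n,S^n) + n\epsilon_n \\
&\overset{(b)}{=} I(M_1;Y^n|S^n) + n\epsilon_n \\
&\overset{(c)}{\le} H(M_1|S^n,M_2) - H(M_1|V_1^l,V_2^l,Y^n,S^n,M_2) + n\epsilon_n \\
&\overset{(d)}{=} I(M_1;V_1^l,V_2^l|S^n,M_2) + I(M_1;Y^n|V_1^l,V_2^l,S^n,M_2) + n\epsilon_n \\
&\overset{(e)}{=} H(V_1^l|S^n,M_2) + H(V_2^l|V_1^l,S^n,M_2) + \sum_{i=1}^{n} I(M_1;Y_i|V_1^l,V_2^l,S^n,M_2,Y^{i-1}) + n\epsilon_n \\
&\overset{(f)}{=} H(V_1^l|S^n,M_2) + \sum_{i=1}^{n} \left[ H(Y_i|V_1^l,V_2^l,S^n,M_2,Y^{i-1}) - H(Y_i|V_1^l,V_2^l,S^n,M_1,M_2,Y^{i-1}) \right] + n\epsilon_n \\
&\overset{(g)}{\le} H(V_1^l) + \sum_{i=1}^{n} \left[ H(Y_i|V_1^l,V_2^l,S^n,X_2^n,M_2,Y^{i-1}) - H(Y_i|V_1^l,V_2^l,S^n,X_1^n,X_2^n,M_1,M_2,Y^{i-1}) \right] + n\epsilon_n \\
&\overset{(h)}{\le} \sum_{j=1}^{l} H(V_{1,j}|V_1^{j-1}) + \sum_{i=1}^{n} \left[ H(Y_i|X_{2,i},S_i,S_{i-d_1},S_{i-d_2},V_1^l,V_2^l,S^{i-d_1-1}) \right. \\
&\qquad \left. - H(Y_i|V_1^l,V_2^l,S^n,X_1^n,X_2^n,M_1,M_2,Y^{i-1}) \right] + n\epsilon_n \\
&\overset{(i)}{\le} \sum_{j=1}^{l} H(V_{1,j}) + \sum_{i=1}^{n} \left[ H(Y_i|X_{2,i},S_i,S_{i-d_1},S_{i-d_2},V_1^l,V_2^l,S^{i-d_1-1}) \right. \\
&\qquad \left. - H(Y_i|X_{1,i},X_{2,i},S_i,S_{i-d_1},S_{i-d_2},V_1^l,V_2^l,S^{i-d_1-1}) \right] + n\epsilon_n \\
&\overset{(j)}{\le} \sum_{j=1}^{l} \log|\mathcal{V}_{1,j}| + \sum_{i=1}^{n} I(X_{1,i};Y_i|X_{2,i},S_i,S_{i-d_1},S_{i-d_2},V_1^l,V_2^l,S^{i-d_1-1}) + n\epsilon_n \\
&\overset{(k)}{\le} nC_{12} + \sum_{i=1}^{n} I(X_{1,i};Y_i|X_{2,i},S_i,S_{i-d_1},S_{i-d_2},U_i) + n\epsilon_n,
\end{aligned}$$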

where: 

(a) follows from (53);

(b) follows from the fact that $M_1$ and $S^n$ are independent;

(c) follows from the fact that $M_1$ and $M_2$ are independent given $S^n$ (first term), and the fact that conditioning reduces entropy (second term);

(d) follows from adding and subtracting the term $H(M_1|V_1^l,V_2^l,S^n,M_2)$;

(e) follows from the entropy and mutual information chain rules and the fact that $V_1^l$ and $V_2^l$ are fully determined given $(S^n,M_1,M_2)$;

(f) follows from the fact that $V_2^l$ is a deterministic function of $(M_2,V_1^l)$;

(g) follows from the fact that conditioning reduces entropy (first term), the fact that $X_1^n$ is a deterministic function of $(M_1,V_1^l,S^n)$ and that $X_2^n$ is a deterministic function of $(M_2,V_2^l,S^n)$ (second and third terms);

(h) follows from the entropy chain rule (first term) and the fact that conditioning reduces entropy (second term);

(i) follows from the fact that conditioning reduces entropy (first term) and the fact that the channel output at time $i$ depends only on the state $S_i$ and the inputs $X_{1,i}$ and $X_{2,i}$ (third term);

(j) follows from the upper bound on entropy;

(k) follows from defining $U_i = (V_1^l, V_2^l, S^{i-d_1-1})$.

Note that we defined here the auxiliary RV $U$ at time $i$ as $U_i = (V_1^l, V_2^l, S^{i-d_1-1})$. Hence, it represents the information shared during the conference (i.e., the parts of the private messages available to both encoders) and the common knowledge of the states. As was the case for the common message setting (cf. Theorem 1 and Subsection III-A), $U$ here represents all the common information available to both users.

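Applying similar arguments to $R_2$ and $R_1 + R_2$, one can conclude that any achievable rate pair $(R_1,R_2)$ must satisfy the following inequalities:

$$R_1 \le \frac{1}{n}\sum_{i=1}^{n} I(X_{1,i};Y_i|X_{2,i},S_i,S_{i-d_2},S_{i-d_1},U_i) + C_{12} + \epsilon_n, \tag{55}$$

$$R_2 \le \frac{1}{n}\sum_{i=1}^{n} I(X_{2,i};Y_i|X_{1,i},S_i,S_{i-d_2},S_{i-d_1},U_i) + C_{21} + \epsilon_n, \tag{56}$$

$$R_1 + R_2 \le \frac{1}{n}\sum_{i=1}^{n} I(X_{1,i},X_{2,i};Y_i|S_i,S_{i-d_2},S_{i-d_1},U_i) + C_{12} + C_{21} + \epsilon_n, \tag{57}$$

$$R_1 + R_2 \le \frac{1}{n}\sum_{i=1}^{n} I(X_{1,i},X_{2,i};Y_i|S_i,S_{i-d_2},S_{i-d_1}) + \epsilon_n. \tag{58}$$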

The expressions on the RHS of the inequalities in (55)-(58) represent empirical averages of mutual informations (taken over the code symbols). These inequalities can be alternatively represented by introducing a new time-sharing RV $Q$, uniformly distributed over $\{1, \ldots, n\}$, as in Subsection III-A. Starting again with the upper bound on $R_1$, this yields

$$\begin{aligned}
R_1 &\le \frac{1}{n}\sum_{i=1}^{n} I(Y_Q;X_{1,Q}|X_{2,Q},S_Q,S_{Q-d_2},S_{Q-d_1},U_Q,Q = i) + C_{12} + \epsilon_n \\
&= I(Y_Q;X_{1,Q}|X_{2,Q},S_Q,S_{Q-d_2},S_{Q-d_1},U_Q,Q) + C_{12} + \epsilon_n.
\end{aligned} \tag{59}$$

Applying the same procedure to the rest of the upper bounds, while denoting $X_1 = X_{1,Q}$, $X_2 = X_{2,Q}$, $Y = Y_Q$, $S = S_Q$, $\tilde{S}_1 = S_{Q-d_1}$, $\tilde{S}_2 = S_{Q-d_2}$ and $U = (U_Q,Q)$, we get

$$R_1 \le I(X_1;Y|X_2,U,S,\tilde{S}_1,\tilde{S}_2) + C_{12} + \epsilon_n,$$

$$R_2 \le I(X_2;Y|X_1,U,S,\tilde{S}_1,\tilde{S}_2) + C_{21} + \epsilon_n,$$

$$R_1 + R_2 \le I(X_1,X_2;Y|U,S,\tilde{S}_1,\tilde{S}_2) + C_{12} + C_{21} + \epsilon_n,$$

$$R_1 + R_2 \le I(X_1,X_2;Y|S,\tilde{S}_1,\tilde{S}_2) + \epsilon_n.$$

To complete the proof of the converse, it is left to show that the following Markov relations hold:

$$P(u|s,\tilde{s}_1,\tilde{s}_2) = P(u|\tilde{s}_1), \tag{60}$$

$$P(x_1|s,\tilde{s}_1,\tilde{s}_2,u) = P(x_1|\tilde{s}_1,u), \tag{61}$$

$$P(x_2|x_1,s,\tilde{s}_1,\tilde{s}_2,u) = P(x_2|\tilde{s}_1,\tilde{s}_2,u). \tag{62}$$

The proof of these relations is given in Appendix D.

It can thus be concluded, by taking the limit as $n \to \infty$ and $P_e^{(n)} \to 0$, that the following holds:

$$R_1 \le I(X_1;Y|X_2,U,S,\tilde{S}_1,\tilde{S}_2) + C_{12}, \tag{63}$$

$$R_2 \le I(X_2;Y|X_1,U,S,\tilde{S}_1,\tilde{S}_2) + C_{21}, \tag{64}$$

$$R_1 + R_2 \le I(X_1,X_2;Y|U,S,\tilde{S}_1,\tilde{S}_2) + C_{12} + C_{21}, \tag{65}$$

$$R_1 + R_2 \le I(X_1,X_2;Y|S,\tilde{S}_1,\tilde{S}_2), \tag{66}$$

for some choice of joint distribution $P(s,\tilde{s}_1,\tilde{s}_2)P(u|\tilde{s}_1)P(x_1|\tilde{s}_1,u)P(x_2|\tilde{s}_1,\tilde{s}_2,u)P(y|x_1,x_2,s)$.



B. Achievability 

To prove the achievability of the capacity region, we need to show that for a fixed distribution of the form $P(u|\tilde{s}_1)P(x_1|u,\tilde{s}_1)P(x_2|u,\tilde{s}_1,\tilde{s}_2)$ and for $(R_1,R_2)$ that satisfy the inequalities in (51), there exists a sequence of $(n, l, 2^{nR_1}, 2^{nR_2}, d_1, d_2)$ codes for which $P_e^{(n)} \to 0$ as $n \to \infty$.

The idea behind this proof is to convert the conferencing problem into a setting that corresponds to the FSM-MAC with common message considered in Section III, and then rely on its capacity result to show that the conferencing capacity region is, indeed, achievable. This is done by sharing as much as possible of the original private messages $(m_1,m_2)$ through the conferencing links, in order to create a common message. The unshared parts of the original messages serve as the private messages in the new setting. By doing so, the coding scheme of the setting with a common message, as detailed in Section III, can be employed.
We start by defining:

$$\hat{R}_1 = \min\{R_1, C_{12}\}, \tag{67}$$

$$\hat{R}_2 = \min\{R_2, C_{21}\}. \tag{68}$$

With respect to these definitions, the inequalities in (51) can be rewritten as

$$\begin{aligned}
(R_1 - \hat{R}_1) &< I(X_1;Y|X_2,U,S,\tilde{S}_1,\tilde{S}_2), \\
(R_2 - \hat{R}_2) &< I(X_2;Y|X_1,U,S,\tilde{S}_1,\tilde{S}_2), \\
(R_1 - \hat{R}_1) + (R_2 - \hat{R}_2) &< I(X_1,X_2;Y|U,S,\tilde{S}_1,\tilde{S}_2), \\
(\hat{R}_1 + \hat{R}_2) + (R_1 - \hat{R}_1) + (R_2 - \hat{R}_2) &< I(X_1,X_2;Y|S,\tilde{S}_1,\tilde{S}_2).
\end{aligned} \tag{69}$$

In view of this representation, we construct a coding scheme by splitting the sets $\mathcal{M}_j = \{1, 2, \ldots, 2^{nR_j}\}$, for $j \in \{1,2\}$, into $2^{n\hat{R}_j}$ cells, each containing $2^{n(R_j - \hat{R}_j)}$ messages, and introducing the functions

$$c_1 : \mathcal{M}_1 \to \{1, 2, \ldots, 2^{n\hat{R}_1}\}, \tag{70}$$

$$c_2 : \mathcal{M}_2 \to \{1, 2, \ldots, 2^{n\hat{R}_2}\}, \tag{71}$$

$$e_1 : \mathcal{M}_1 \to \{1, 2, \ldots, 2^{n(R_1 - \hat{R}_1)}\}, \tag{72}$$

$$e_2 : \mathcal{M}_2 \to \{1, 2, \ldots, 2^{n(R_2 - \hat{R}_2)}\}. \tag{73}$$

That is, for every message $m_j$, where $j \in \{1,2\}$, $c_j$ returns its cell number, $c_j(m_j)$, while $e_j$ returns its index number, $e_j(m_j)$, inside the cell $c_j(m_j)$. For the sake of simplicity, we assume here that $2^{n\hat{R}_1}$, $2^{n\hat{R}_2}$, $2^{n(R_1 - \hat{R}_1)}$ and $2^{n(R_2 - \hat{R}_2)}$ are integers, although the same approach can be formalized for real numbers as well. Also note that the partitioning is deterministic.
Now, for every message pair $(m_1,m_2)$ let us define the triplet $(m_0', m_1', m_2')$ where

$$m_1' = e_1(m_1), \tag{74}$$

$$m_2' = e_2(m_2), \tag{75}$$

$$m_0' = (c_1(m_1), c_2(m_2)). \tag{76}$$

Note that the above definitions dictate that $m_1' \in \{1, 2, \ldots, 2^{n(R_1 - \hat{R}_1)}\}$, $m_2' \in \{1, 2, \ldots, 2^{n(R_2 - \hat{R}_2)}\}$ and $m_0' \in \{1, 2, \ldots, 2^{n\hat{R}_1}\} \times \{1, 2, \ldots, 2^{n\hat{R}_2}\}$. Since, by definition,

$$\hat{R}_1 \le C_{12}, \tag{77}$$

$$\hat{R}_2 \le C_{21}, \tag{78}$$

it is possible for Encoder 1 to transmit $c_1(m_1)$ to Encoder 2, and for Encoder 2 to transmit $c_2(m_2)$ to Encoder 1, via the respective conferencing links. Therefore, following the conferencing stage, both encoders know $(c_1(m_1), c_2(m_2))$, which can be viewed as a common message and is henceforth denoted by $m_0'$; $m_1'$ and $m_2'$ are viewed as the new private messages.
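The deterministic partition into cells can be sketched in a few lines. This is a minimal illustration under assumptions made here: messages are indexed from 1, cells are filled consecutively, and the names `split_message`, `merge_message` and `to_common_message_setting` are hypothetical. Any other fixed partition with equally sized cells would serve the same purpose.

```python
def split_message(m, n_cells, cell_size):
    """Return (cell number, index inside the cell), both 1-based.

    Plays the role of the pair (c_j(m_j), e_j(m_j)) for a message set of
    size n_cells * cell_size partitioned into consecutive cells.
    """
    if not 1 <= m <= n_cells * cell_size:
        raise ValueError("message index out of range")
    cell, idx = divmod(m - 1, cell_size)
    return cell + 1, idx + 1


def merge_message(cell, idx, cell_size):
    """Invert split_message: recover the original message index."""
    return (cell - 1) * cell_size + idx


def to_common_message_setting(m1, m2, cells1, size1, cells2, size2):
    """Map a private pair (m1, m2) to (common message, new private 1, new private 2)."""
    c1, e1 = split_message(m1, cells1, size1)
    c2, e2 = split_message(m2, cells2, size2)
    # The cell numbers are exchanged over the conferencing links and
    # together form the common message; the in-cell indices stay private.
    return (c1, c2), e1, e2
```

Because `merge_message` inverts `split_message`, a decoder that recovers the common message and both new private messages can reconstruct the original pair.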



The above setting can hence be viewed as an FSM-MAC with common message. The messages to be transmitted are given by the triplet $(m_0', m_1', m_2')$, where $m_0' \in \{1, 2, \ldots, 2^{n\hat{R}_1}\} \times \{1, 2, \ldots, 2^{n\hat{R}_2}\}$, $m_1' \in \{1, 2, \ldots, 2^{n(R_1 - \hat{R}_1)}\}$ and $m_2' \in \{1, 2, \ldots, 2^{n(R_2 - \hat{R}_2)}\}$, while (69) holds by assumption. By Theorem 1, it now immediately follows that the new message triplet $(m_0', m_1', m_2')$ can be transmitted to the decoder with an arbitrarily small probability of error. The decoder can, therefore, reliably reconstruct the message pair $(m_1,m_2)$, and the rate region (51) is hence achievable.



V. THE VECTOR GAUSSIAN FSM-MAC WITH DIAGONAL CHANNEL TRANSFER MATRICES, 

CONFERENCING AND DELAYED CSI 

In this section we consider the vector Gaussian FSM-MAC with diagonal channel transfer matrices, partially cooperative encoders and delayed CSI. For every time instance $t \in \{1, \ldots, n\}$, the channel model under consideration is given by:

$$\mathbf{Y}_t = \mathbf{G}_1(s_t)\mathbf{X}_{1,t} + \mathbf{G}_2(s_t)\mathbf{X}_{2,t} + \mathbf{Z}_t, \tag{79}$$

where $\{\mathbf{G}_1(s)\}_{s \in \mathcal{S}}$ and $\{\mathbf{G}_2(s)\}_{s \in \mathcal{S}}$ are $N \times N$ diagonal matrices, which are deterministic functions of the channel state $S = s$ (for simplicity, we henceforth omit the time index $t$). We denote the diagonal entries of these matrices by $g_{1,i}(s)$ and $g_{2,i}(s)$, respectively, for $i \in \{1, \ldots, N\}$. Moreover, we assume $\mathbf{G}_1(s), \mathbf{G}_2(s) \in \mathbb{C}^{N \times N}$ for every $s \in \mathcal{S}$. $\mathbf{X}_1, \mathbf{X}_2 \in \mathbb{C}^N$ and $\mathbf{Y} \in \mathbb{C}^N$ are the channel input vectors and the channel output vector, respectively. $\mathbf{Z}$ is a proper complex Gaussian vector, independent of $\mathbf{X}_1$ and $\mathbf{X}_2$ and distributed according to $\mathbf{Z} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$, where $\mathbf{I}$ is the identity matrix of dimensions $N \times N$. The input vector signals are assumed to satisfy the average power constraints

$$\mathrm{tr}(\Sigma_{X_1X_1}) \le \mathcal{P}_1 \; ; \quad \mathrm{tr}(\Sigma_{X_2X_2}) \le \mathcal{P}_2, \tag{80}$$

where we use the standard notation $\Sigma_{XY} = \mathbb{E}[\mathbf{X}\mathbf{Y}^\dagger]$, and $\mathbf{A}^\dagger$ denotes the conjugate transpose of the matrix $\mathbf{A}$.
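Since the channel matrices in (79) are diagonal, one channel use acts on each of the $N$ sub-channels separately. The following minimal sketch makes this explicit. The function name `channel_output`, its argument layout (one gain per sub-channel for the current state) and the optional `noise` switch are assumptions made here for illustration only.

```python
import random


def channel_output(g1, g2, x1, x2, noise=True, rng=None):
    """One use of the diagonal channel Y = G1 X1 + G2 X2 + Z.

    g1, g2: diagonal entries of G1(s) and G2(s) for the current state s.
    x1, x2: input vectors of the two encoders (sequences of complex numbers).
    The noise is proper complex Gaussian with unit variance per entry,
    i.e. real and imaginary parts are independent with variance 1/2 each.
    """
    rng = rng or random.Random()
    std = 0.5 ** 0.5
    out = []
    for a, b, u, v in zip(g1, g2, x1, x2):
        z = complex(rng.gauss(0.0, std), rng.gauss(0.0, std)) if noise else 0j
        out.append(a * u + b * v + z)
    return out
```

Calling the function with `noise=False` returns only the signal part, which is convenient for checking the gains of a particular state.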

The motivation for examining the channel model in (79) stems from the fact that it can be used to represent an OFDM-based communication system employing single receive and transmit antennas. OFDM is an efficient technique for mitigating frequency-selective fading, which is typical of modern wide-band communication systems
(see, e.g., IH, B2ll ). The underlying idea behind OFDM is to split the channel's bandwidth into N separate sub- 
channels through which orthogonal signals are transmitted. By doing so, not only is the effect of intersymbol 
interference (ISI) dramatically reduced, but the transfer functions of each of the sub-channels boil down to 
multiplicative scalar gains. These gains are modeled by the diagonal entries of the channel matrices defined above. 
In this section we derive the maximization problem that defines the capacity region for the vector Gaussian channel 
in concern and convert it into a convex form. The solution of this convex maximization problem, which can be 
easily obtained using a numerical tool such as CVX [41] , also yields the optimal power allocation strategy among 
the sub-channels, which is another essential factor in an OFDM-based transmission. 



A. Capacity Region 

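Theorem 3 The capacity region of the power-constrained vector Gaussian FSM-MAC with diagonal channel transfer matrices, partially cooperative encoders, cooperation link capacities $C_{12}$ and $C_{21}$, delayed CSI and average power constraints $(\mathcal{P}_1,\mathcal{P}_2)$ is the union of all sets of rate pairs $(R_1,R_2) \in \mathbb{R}_+^2$ satisfying

$$R_1 \le \sum_{\tilde{s}_1} \pi(\tilde{s}_1) \sum_{\tilde{s}_2} K^{d_1-d_2}(\tilde{s}_2,\tilde{s}_1) \sum_{s} K^{d_2}(s,\tilde{s}_2) \sum_{i=1}^{N} \log\left(1 + |g_{1,i}(s)|^2 \gamma_{1,i}(\tilde{s}_1)\right) + C_{12}, \tag{81}$$

$$R_2 \le \sum_{\tilde{s}_1} \pi(\tilde{s}_1) \sum_{\tilde{s}_2} K^{d_1-d_2}(\tilde{s}_2,\tilde{s}_1) \sum_{s} K^{d_2}(s,\tilde{s}_2) \sum_{i=1}^{N} \log\left(1 + |g_{2,i}(s)|^2 \gamma_{2,i}(\tilde{s}_1,\tilde{s}_2)\right) + C_{21}, \tag{82}$$

$$R_1 + R_2 \le \sum_{\tilde{s}_1} \pi(\tilde{s}_1) \sum_{\tilde{s}_2} K^{d_1-d_2}(\tilde{s}_2,\tilde{s}_1) \sum_{s} K^{d_2}(s,\tilde{s}_2) \sum_{i=1}^{N} \log\left(1 + |g_{1,i}(s)|^2 \gamma_{1,i}(\tilde{s}_1) + |g_{2,i}(s)|^2 \gamma_{2,i}(\tilde{s}_1,\tilde{s}_2)\right) + C_{12} + C_{21}, \tag{83}$$

$$\begin{aligned}
R_1 + R_2 \le \sum_{\tilde{s}_1} \pi(\tilde{s}_1) \sum_{\tilde{s}_2} K^{d_1-d_2}(\tilde{s}_2,\tilde{s}_1) \sum_{s} K^{d_2}(s,\tilde{s}_2) \sum_{i=1}^{N} \log\Big(&1 + |g_{1,i}(s)|^2 P_{1,i}(\tilde{s}_1) + |g_{2,i}(s)|^2 P_{2,i}(\tilde{s}_1,\tilde{s}_2) \\
&+ 2 g_{1,i}(s) g_{2,i}(s) \sqrt{\left(P_{1,i}(\tilde{s}_1) - \gamma_{1,i}(\tilde{s}_1)\right)\left(P_{2,i}(\tilde{s}_1,\tilde{s}_2) - \gamma_{2,i}(\tilde{s}_1,\tilde{s}_2)\right)}\Big),
\end{aligned} \tag{84}$$

where the union is taken over all $\{\gamma_{1,i}(\tilde{s}_1)\}_{i \in \{1,\ldots,N\}, \tilde{s}_1 \in \mathcal{S}}$, $\{\gamma_{2,i}(\tilde{s}_1,\tilde{s}_2)\}_{i \in \{1,\ldots,N\}, (\tilde{s}_1,\tilde{s}_2) \in \mathcal{S}^2}$, $\{P_{1,i}(\tilde{s}_1)\}_{i \in \{1,\ldots,N\}, \tilde{s}_1 \in \mathcal{S}}$ and $\{P_{2,i}(\tilde{s}_1,\tilde{s}_2)\}_{i \in \{1,\ldots,N\}, (\tilde{s}_1,\tilde{s}_2) \in \mathcal{S}^2}$ that satisfy the constraints:

$$\sum_{\tilde{s}_1} \pi(\tilde{s}_1) \sum_{i=1}^{N} P_{1,i}(\tilde{s}_1) \le \mathcal{P}_1, \tag{85}$$

$$\sum_{\tilde{s}_1} \pi(\tilde{s}_1) \sum_{\tilde{s}_2} K^{d_1-d_2}(\tilde{s}_2,\tilde{s}_1) \sum_{i=1}^{N} P_{2,i}(\tilde{s}_1,\tilde{s}_2) \le \mathcal{P}_2, \tag{86}$$

$$0 \le \gamma_{1,i}(\tilde{s}_1) \le P_{1,i}(\tilde{s}_1), \quad \forall\, i \in \{1,\ldots,N\},\ \tilde{s}_1 \in \mathcal{S}, \tag{87}$$

$$0 \le \gamma_{2,i}(\tilde{s}_1,\tilde{s}_2) \le P_{2,i}(\tilde{s}_1,\tilde{s}_2), \quad \forall\, i \in \{1,\ldots,N\},\ (\tilde{s}_1,\tilde{s}_2) \in \mathcal{S}^2. \tag{88}$$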

The corresponding capacity region for the analogous setting with a common message can be obtained by taking:

$$R_0 = C_{12} + C_{21} \; ; \quad R_1 = \max\{0, R_1 - C_{12}\} \; ; \quad R_2 = \max\{0, R_2 - C_{21}\}, \tag{89}$$

where $R_0$ denotes the common message rate, and $R_1$ and $R_2$ denote the rates of the private messages (according to the common message channel definition in Subsection III-A).

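Corollary 4 The capacity region of the power-constrained vector Gaussian FSM-MAC with diagonal channel transfer matrices, a common message, delayed CSI and average power constraints $(\mathcal{P}_1,\mathcal{P}_2)$ is the union of all sets of rate triplets $(R_0,R_1,R_2) \in \mathbb{R}_+^3$ satisfying

$$R_1 \le \sum_{\tilde{s}_1} \pi(\tilde{s}_1) \sum_{\tilde{s}_2} K^{d_1-d_2}(\tilde{s}_2,\tilde{s}_1) \sum_{s} K^{d_2}(s,\tilde{s}_2) \sum_{i=1}^{N} \log\left(1 + |g_{1,i}(s)|^2 \gamma_{1,i}(\tilde{s}_1)\right), \tag{90}$$

$$R_2 \le \sum_{\tilde{s}_1} \pi(\tilde{s}_1) \sum_{\tilde{s}_2} K^{d_1-d_2}(\tilde{s}_2,\tilde{s}_1) \sum_{s} K^{d_2}(s,\tilde{s}_2) \sum_{i=1}^{N} \log\left(1 + |g_{2,i}(s)|^2 \gamma_{2,i}(\tilde{s}_1,\tilde{s}_2)\right), \tag{91}$$

$$R_1 + R_2 \le \sum_{\tilde{s}_1} \pi(\tilde{s}_1) \sum_{\tilde{s}_2} K^{d_1-d_2}(\tilde{s}_2,\tilde{s}_1) \sum_{s} K^{d_2}(s,\tilde{s}_2) \sum_{i=1}^{N} \log\left(1 + |g_{1,i}(s)|^2 \gamma_{1,i}(\tilde{s}_1) + |g_{2,i}(s)|^2 \gamma_{2,i}(\tilde{s}_1,\tilde{s}_2)\right), \tag{92}$$

$$\begin{aligned}
R_0 + R_1 + R_2 \le \sum_{\tilde{s}_1} \pi(\tilde{s}_1) \sum_{\tilde{s}_2} K^{d_1-d_2}(\tilde{s}_2,\tilde{s}_1) \sum_{s} K^{d_2}(s,\tilde{s}_2) \sum_{i=1}^{N} \log\Big(&1 + |g_{1,i}(s)|^2 P_{1,i}(\tilde{s}_1) + |g_{2,i}(s)|^2 P_{2,i}(\tilde{s}_1,\tilde{s}_2) \\
&+ 2 g_{1,i}(s) g_{2,i}(s) \sqrt{\left(P_{1,i}(\tilde{s}_1) - \gamma_{1,i}(\tilde{s}_1)\right)\left(P_{2,i}(\tilde{s}_1,\tilde{s}_2) - \gamma_{2,i}(\tilde{s}_1,\tilde{s}_2)\right)}\Big),
\end{aligned} \tag{93}$$

where the union is taken over the same domain satisfying the constraints (85)-(88).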
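For a fixed allocation, the right-hand side of the single-user bound on $R_1$ in Theorem 3 (the bound (90) plus $C_{12}$) is a weighted sum over state triplets and can be evaluated directly. The sketch below does this under assumptions made here: states are indexed $0, \ldots, k-1$, each kernel is passed as a row-stochastic matrix whose entry `[a][b]` is the probability of moving from state `a` to state `b` over the relevant number of steps, logarithms are taken to base 2, and the function name `r1_bound` is hypothetical. It only evaluates the bound; it does not perform the optimization over the allocation.

```python
import math


def r1_bound(pi, k_gap, k_d2, gains1, gamma1, c12=0.0):
    """Evaluate the bound on R1 for a given allocation gamma1.

    pi:      stationary distribution over the most delayed state.
    k_gap:   (d1 - d2)-step kernel, k_gap[s1][s2].
    k_d2:    d2-step kernel, k_d2[s2][s].
    gains1:  gains1[s][i] is the gain of sub-channel i in state s.
    gamma1:  gamma1[s1][i] is the allocation on sub-channel i given s1.
    c12:     capacity of the link from Encoder 1 to Encoder 2.
    """
    states = range(len(pi))
    total = 0.0
    for s1 in states:
        for s2 in states:
            for s in states:
                weight = pi[s1] * k_gap[s1][s2] * k_d2[s2][s]
                rate = sum(
                    math.log2(1.0 + abs(g) ** 2 * gam)
                    for g, gam in zip(gains1[s], gamma1[s1])
                )
                total += weight * rate
    return total + c12
```

With identity kernels (no state change between the delayed and current states), unit gains, and allocations 1 and 3 in the two delayed states, the weighted sum is $0.5\log_2 2 + 0.5\log_2 4 = 1.5$.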

Note that the regions in Theorem 3 and Corollary 4 are both given in the form of a convex optimization problem, which can be solved efficiently using numerical tools. In the following proof we first derive a slightly different, yet equivalent, region for the Gaussian conferencing model; this original capacity region involves a non-convex optimization problem. Then, by defining new variables, we convert the optimization problem into a convex one.
Proof: A straightforward extension of the result stated in Theorem 2 yields the capacity region of the general vector FSM-MAC with partially cooperative encoders and delayed CSI. The region is given by the closure of the set of rate pairs $(R_1,R_2) \in \mathbb{R}_+^2$ that satisfy (cf. (51))

$$\begin{aligned}
R_1 &\le I(\mathbf{X}_1;\mathbf{Y}|\mathbf{X}_2,\mathbf{U},S,\tilde{S}_1,\tilde{S}_2) + C_{12}, \\
R_2 &\le I(\mathbf{X}_2;\mathbf{Y}|\mathbf{X}_1,\mathbf{U},S,\tilde{S}_1,\tilde{S}_2) + C_{21}, \\
R_1 + R_2 &\le I(\mathbf{X}_1,\mathbf{X}_2;\mathbf{Y}|\mathbf{U},S,\tilde{S}_1,\tilde{S}_2) + C_{12} + C_{21}, \\
R_1 + R_2 &\le I(\mathbf{X}_1,\mathbf{X}_2;\mathbf{Y}|S,\tilde{S}_1,\tilde{S}_2),
\end{aligned} \tag{94}$$

for some joint distribution of the form

$$P(\mathbf{u}|\tilde{s}_1)P(\mathbf{x}_1|\tilde{s}_1,\mathbf{u})P(\mathbf{x}_2|\tilde{s}_1,\tilde{s}_2,\mathbf{u}), \tag{95}$$

where $\mathbf{U}$ is an auxiliary random vector with bounded cardinality. Note that the structure of the conditional PDF in (95) implies the Markov relations:

$$\mathbf{U} - \tilde{S}_1 - (S,\tilde{S}_2), \tag{96}$$

$$\mathbf{X}_1 - (\mathbf{U},\tilde{S}_1) - (S,\tilde{S}_2), \tag{97}$$

$$\mathbf{X}_2 - (\mathbf{U},\tilde{S}_1,\tilde{S}_2) - (S,\mathbf{X}_1). \tag{98}$$

The proof of Theorem 3 consists of two main parts. First, we provide an outer bound for the general capacity region in (94). Then, by choosing a jointly Gaussian distribution for $(\mathbf{X}_1,\mathbf{U},\mathbf{X}_2)$, we show that the upper bound is indeed achievable and thus characterizes the actual capacity region.

The outer bound for the capacity region is obtained by substituting the RVs $(\mathbf{X}_1,\mathbf{U},\mathbf{X}_2)$ in (94) with appropriately chosen jointly Gaussian RVs $(\mathbf{X}_1^G,\mathbf{V}^G,\mathbf{X}_2^G)$, which satisfy a certain Markovian relation. We conclude that the RVs $(\mathbf{X}_1^G,\mathbf{V}^G,\mathbf{X}_2^G)$ indeed admit the desired Markov relation using the following lemma [40, Section 2, Theorem 1].



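Lemma 5 Let $(\mathbf{A},\mathbf{B},\mathbf{C})$ be jointly Gaussian random vectors. Then $(\mathbf{A},\mathbf{B},\mathbf{C})$ form a Markov chain $\mathbf{A} - \mathbf{B} - \mathbf{C}$ if and only if their covariance matrices satisfy:

$$\Sigma_{AC} = \Sigma_{AB}\Sigma_{BB}^{-1}\Sigma_{BC}. \tag{99}$$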
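The covariance condition of Lemma 5 is easy to check numerically in the real scalar case. The sketch below is an illustration only and rests on assumptions made here: all variables are real, zero-mean scalars, and the hypothetical helper `linear_model_covariances` builds the covariances of the model $A = aB + N_A$, $C = cB + N_C$ with noises independent of $B$ and of each other, for which $A - B - C$ holds by construction.

```python
def markov_gap(s_ac, s_ab, s_bb, s_bc):
    """Return S_AC - S_AB * S_BB^{-1} * S_BC for real scalars.

    A zero gap is the covariance condition of the lemma, specialized
    to scalar jointly Gaussian variables.
    """
    return s_ac - s_ab * s_bc / s_bb


def linear_model_covariances(a, c, var_b):
    """Covariances for A = a*B + N_A and C = c*B + N_C.

    N_A and N_C are independent of B and of each other, so the only
    dependence between A and C is through B.
    """
    return {"ab": a * var_b, "bc": c * var_b, "ac": a * c * var_b, "bb": var_b}
```

For instance, with $a = 2$, $c = 3$ and $\mathrm{Var}(B) = 1.5$ the gap is zero, whereas adding any extra correlation between $A$ and $C$ that does not pass through $B$ makes it nonzero.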

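As before, we restrict the detailed derivation to the upper bound on $R_1$, while noting that all other bounds in (94) can be treated in an analogous manner. To this end, we rewrite the bound on $R_1$ as

$$R_1 \le \sum_{\tilde{s}_1} \pi(\tilde{s}_1) \sum_{\tilde{s}_2} K^{d_1-d_2}(\tilde{s}_2,\tilde{s}_1) \sum_{s} K^{d_2}(s,\tilde{s}_2)\, I(\mathbf{X}_1;\mathbf{Y}|\mathbf{X}_2,\mathbf{U},s,\tilde{s}_1,\tilde{s}_2) + C_{12} \tag{100}$$

and proceed with upper bounding each of the mutual information terms in the sum. Consider:

$$\begin{aligned}
I(\mathbf{X}_1;\mathbf{Y}|\mathbf{X}_2,\mathbf{U},s,\tilde{s}_1,\tilde{s}_2) &\overset{(a)}{=} h(\mathbf{G}_1(s)\mathbf{X}_1 + \mathbf{Z}|\mathbf{U},s,\tilde{s}_1) - h(\mathbf{Z}) \\
&\overset{(b)}{\le} h(\mathbf{G}_1(s)\mathbf{X}_1 + \mathbf{Z}|\mathbf{V},s,\tilde{s}_1) - h(\mathbf{Z}) \\
&\overset{(c)}{\le} h(\mathbf{G}_1(s)\mathbf{X}_1^G + \mathbf{Z}|\mathbf{V}^G,s,\tilde{s}_1) - h(\mathbf{Z}) \\
&\le \sum_{i=1}^{N} \left\{ h(g_{1,i}(s)X_{1,i}^G + Z_i, V_i^G|s,\tilde{s}_1) - h(V_i^G|\tilde{s}_1) \right\} - h(\mathbf{Z}) \qquad (101) \\
&\overset{(d)}{=} \sum_{i=1}^{N} \log\left( \pi e \left( 1 + |g_{1,i}(s)|^2 \left( \mathbb{E}[|X_{1,i}^G|^2|\tilde{s}_1] - \frac{|\mathbb{E}[X_{1,i}^G (V_i^G)^*|\tilde{s}_1]|^2}{\mathbb{E}[|V_i^G|^2|\tilde{s}_1]} \right) \right) \right) - N\log(\pi e) \\
&\overset{(e)}{=} \sum_{i=1}^{N} \log\left( 1 + |g_{1,i}(s)|^2 P_{1,i}(\tilde{s}_1) \left( 1 - \frac{|\mathbb{E}[X_{1,i}^G (V_i^G)^*|\tilde{s}_1]|^2}{P_{1,i}(\tilde{s}_1)\,\mathbb{E}[|V_i^G|^2|\tilde{s}_1]} \right) \right) \qquad (102) \\
&\overset{(f)}{=} \sum_{i=1}^{N} \log\left( 1 + |g_{1,i}(s)|^2 \bar{\beta}_{1,i}(\tilde{s}_1) P_{1,i}(\tilde{s}_1) \right), \qquad (103)
\end{aligned}$$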


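where:

(a) follows from (79) and the Markov relations (96)-(98);

(b) follows from substituting the RV $\mathbf{U}$ for any given $\tilde{S}_1 = \tilde{s}_1$ with a new RV, $\mathbf{V}(\tilde{s}_1) = \mathbb{E}[\mathbf{X}_1|\mathbf{U},\tilde{s}_1]$. Note that this is in fact the optimal estimator, in the minimum mean square error (MMSE) sense, of $\mathbf{X}_1$ given $\mathbf{U}$, for each specified delayed CSI $\tilde{S}_1 = \tilde{s}_1$. Substituting $\mathbf{U}$ for any given $\tilde{S}_1 = \tilde{s}_1$ with $\mathbf{V}(\tilde{s}_1)$ increases the first entropy term in view of the fact that $\mathbf{V}(\tilde{s}_1)$ is a deterministic function of the pair $(\mathbf{U},\tilde{s}_1)$, whereas $h(\mathbf{Z})$ is not affected by the substitution. Moreover, one can easily confirm that $(\mathbf{X}_1,\mathbf{V},\mathbf{X}_2)$ satisfy the covariance condition (99), i.e., the relation

$$\Sigma_{X_1X_2}(\tilde{s}_1,\tilde{s}_2) = \Sigma_{X_1V}(\tilde{s}_1)\Sigma_{VV}^{-1}(\tilde{s}_1)\Sigma_{VX_2}(\tilde{s}_1,\tilde{s}_2) \tag{104}$$

holds for every $(\tilde{s}_1,\tilde{s}_2) \in \mathcal{S}^2$. Note that the dependence of the covariance matrices on the states is induced by the Markov relations (96)-(98);

(c) follows from the maximum differential entropy lemma [P3l Section 2.2], which states that the differential entropy, $h(\mathbf{X}|\mathbf{Y})$, for a pair of RVs $(\mathbf{X},\mathbf{Y})$ distributed according to $f_{XY}(\mathbf{x},\mathbf{y})$, with covariance matrices $\Sigma_{XX}$ and $\Sigma_{YY}$, is maximized for jointly Gaussian $(\mathbf{X},\mathbf{Y})$. Therefore, introducing the triplet $(\mathbf{X}_1^G,\mathbf{V}^G,\mathbf{X}_2^G)$ of zero-mean jointly Gaussian RVs with the same auto- and cross-covariance matrices as those of $(\mathbf{X}_1,\mathbf{V},\mathbf{X}_2)$, and replacing $(\mathbf{X}_1,\mathbf{V},\mathbf{X}_2)$ with $(\mathbf{X}_1^G,\mathbf{V}^G,\mathbf{X}_2^G)$, increases the first entropy term. Moreover, by Lemma 5 we conclude that the Gaussian triplet $(\mathbf{X}_1^G,\mathbf{V}^G,\mathbf{X}_2^G)$, for any given $(S,\tilde{S}_1,\tilde{S}_2) = (s,\tilde{s}_1,\tilde{s}_2)$, is Markov, i.e., $\mathbf{X}_1^G(\tilde{s}_1) - \mathbf{V}^G(\tilde{s}_1) - \mathbf{X}_2^G(\tilde{s}_1,\tilde{s}_2)$ holds;

(d) follows from explicitly evaluating each of the entropy terms;

(e) follows from defining $P_{1,i}(\tilde{s}_1) = \mathbb{E}[|X_{1,i}^G|^2|\tilde{s}_1]$ and $P_{2,i}(\tilde{s}_1,\tilde{s}_2) = \mathbb{E}[|X_{2,i}^G|^2|\tilde{s}_1,\tilde{s}_2]$ (note that these are in fact the $i$-th diagonal entries of the covariance matrices $\Sigma_{X_1^GX_1^G}(\tilde{s}_1)$ and $\Sigma_{X_2^GX_2^G}(\tilde{s}_1,\tilde{s}_2)$, respectively. For this reason, the constraints in (85)-(86) follow immediately from (80) by applying the law of total expectation);

(f) follows from defining

$$\beta_{1,i}(\tilde{s}_1) = \frac{|\mathbb{E}[X_{1,i}^G (V_i^G)^*|\tilde{s}_1]|^2}{\mathbb{E}[|X_{1,i}^G|^2|\tilde{s}_1]\,\mathbb{E}[|V_i^G|^2|\tilde{s}_1]} = \frac{|\mathbb{E}[X_{1,i}^G (V_i^G)^*|\tilde{s}_1]|^2}{P_{1,i}(\tilde{s}_1)\,\mathbb{E}[|V_i^G|^2|\tilde{s}_1]}, \tag{105}$$

$$\beta_{2,i}(\tilde{s}_1,\tilde{s}_2) = \frac{|\mathbb{E}[V_i^G (X_{2,i}^G)^*|\tilde{s}_1,\tilde{s}_2]|^2}{\mathbb{E}[|X_{2,i}^G|^2|\tilde{s}_1,\tilde{s}_2]\,\mathbb{E}[|V_i^G|^2|\tilde{s}_1]} = \frac{|\mathbb{E}[V_i^G (X_{2,i}^G)^*|\tilde{s}_1,\tilde{s}_2]|^2}{P_{2,i}(\tilde{s}_1,\tilde{s}_2)\,\mathbb{E}[|V_i^G|^2|\tilde{s}_1]}, \tag{106}$$

where we use the notation $\bar{a} = 1 - a$, $a \in \mathbb{R}$.

Note that since $\beta_{1,i}(\tilde{s}_1)$ (respectively, $\beta_{2,i}(\tilde{s}_1,\tilde{s}_2)$) is defined to be the squared correlation coefficient between $X_{1,i}^G$ (respectively, $X_{2,i}^G$) and $V_i^G$ for a given delayed CSI $\tilde{S}_1 = \tilde{s}_1$ (respectively, delayed CSI pair $(\tilde{S}_1,\tilde{S}_2) = (\tilde{s}_1,\tilde{s}_2)$), we have that $\beta_{1,i}(\tilde{s}_1), \beta_{2,i}(\tilde{s}_1,\tilde{s}_2) \in [0,1]$ for every $i \in \{1,\ldots,N\}$. The upper bound on $R_2$ and both upper bounds on the sum rate $R_1 + R_2$ are constructed in a similar manner.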



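Next, we show that the upper bounds are also achievable. In order to do so, we take $(\mathbf{X}_1,\mathbf{U},\mathbf{X}_2)$ to be zero-mean jointly Gaussian RVs that admit the Markov relations (96)-(98), and for which the auto- and cross-covariance matrices $\Sigma_{X_1X_1}(\tilde{s}_1)$, $\Sigma_{X_2X_2}(\tilde{s}_1,\tilde{s}_2)$, $\Sigma_{UU}(\tilde{s}_1)$, $\Sigma_{X_1U}(\tilde{s}_1)$ and $\Sigma_{X_2U}(\tilde{s}_1,\tilde{s}_2)$ are diagonal for every $(\tilde{s}_1,\tilde{s}_2) \in \mathcal{S}^2$. Specifically, we take

$$\Sigma_{X_1X_1}(\tilde{s}_1) = \mathrm{diag}\left(\{P_{1,i}(\tilde{s}_1)\}_{i=1}^{N}\right), \tag{107}$$

$$\Sigma_{X_2X_2}(\tilde{s}_1,\tilde{s}_2) = \mathrm{diag}\left(\{P_{2,i}(\tilde{s}_1,\tilde{s}_2)\}_{i=1}^{N}\right), \tag{108}$$

and denote the diagonal entries of the three remaining covariance matrices by $\sigma_{U_i}^2(\tilde{s}_1)$, $\mathbb{E}[X_{1,i}U_i^*|\tilde{s}_1]$ and $\mathbb{E}[X_{2,i}U_i^*|\tilde{s}_1,\tilde{s}_2]$, respectively. Moreover, $(\mathbf{X}_1,\mathbf{U},\mathbf{X}_2)$ are chosen to have the same entry-wise correlations as $(\mathbf{X}_1^G,\mathbf{V}^G,\mathbf{X}_2^G)$, that is

$$\frac{|\mathbb{E}[X_{1,i}U_i^*|\tilde{s}_1]|^2}{P_{1,i}(\tilde{s}_1)\,\sigma_{U_i}^2(\tilde{s}_1)} = \beta_{1,i}(\tilde{s}_1), \tag{109}$$

$$\frac{|\mathbb{E}[X_{2,i}U_i^*|\tilde{s}_1,\tilde{s}_2]|^2}{P_{2,i}(\tilde{s}_1,\tilde{s}_2)\,\sigma_{U_i}^2(\tilde{s}_1)} = \beta_{2,i}(\tilde{s}_1,\tilde{s}_2). \tag{110}$$

The upper bounds are achieved by this choice of distribution and notations. We present the calculation only for $R_1$. As in (101), using the channel model and the Markov relations, we have that:

$$\begin{aligned}
I(\mathbf{X}_1;\mathbf{Y}|\mathbf{X}_2,\mathbf{U},s,\tilde{s}_1,\tilde{s}_2) &= h(\mathbf{G}_1(s)\mathbf{X}_1 + \mathbf{Z}|\mathbf{U},s,\tilde{s}_1) - h(\mathbf{Z}) \\
&= h(\mathbf{G}_1(s)\mathbf{X}_1 + \mathbf{Z},\mathbf{U}|s,\tilde{s}_1) - h(\mathbf{U}|\tilde{s}_1) - h(\mathbf{Z}).
\end{aligned} \tag{111}$$

Clearly

$$h(\mathbf{Z}) = \log\left((\pi e)^N\right), \tag{112}$$

$$h(\mathbf{U}|\tilde{s}_1) = \log\left((\pi e)^N \prod_{i=1}^{N} \sigma_{U_i}^2(\tilde{s}_1)\right). \tag{113}$$

Therefore, it is left to evaluate

$$h(\mathbf{G}_1(s)\mathbf{X}_1 + \mathbf{Z},\mathbf{U}|s,\tilde{s}_1) = \log\left((\pi e)^{2N}\,|\Sigma(s,\tilde{s}_1)|\right), \tag{114}$$

where $\Sigma(s,\tilde{s}_1)$ is a block matrix of the structure

$$\Sigma(s,\tilde{s}_1) = \begin{pmatrix} \mathbf{G}_1(s)\Sigma_{X_1X_1}(\tilde{s}_1)\mathbf{G}_1^\dagger(s) + \mathbf{I} & \mathbf{G}_1(s)\Sigma_{X_1U}(\tilde{s}_1) \\ \Sigma_{UX_1}(\tilde{s}_1)\mathbf{G}_1^\dagger(s) & \Sigma_{UU}(\tilde{s}_1) \end{pmatrix}. \tag{115}$$

After some algebra it can be shown that:

$$\begin{aligned}
|\Sigma(s,\tilde{s}_1)| &= \prod_{i=1}^{N} \left[ \left(|g_{1,i}(s)|^2 P_{1,i}(\tilde{s}_1) + 1\right)\sigma_{U_i}^2(\tilde{s}_1) - |g_{1,i}(s)|^2 P_{1,i}(\tilde{s}_1)\beta_{1,i}(\tilde{s}_1)\sigma_{U_i}^2(\tilde{s}_1) \right] \\
&= \left( \prod_{i=1}^{N} \sigma_{U_i}^2(\tilde{s}_1) \right) \cdot \prod_{i=1}^{N} \left( |g_{1,i}(s)|^2 \bar{\beta}_{1,i}(\tilde{s}_1) P_{1,i}(\tilde{s}_1) + 1 \right).
\end{aligned} \tag{116}$$

Substituting (116) along with (112) and (113) into (111) and summing the mutual information terms over all state triplets $(S,\tilde{S}_1,\tilde{S}_2) = (s,\tilde{s}_1,\tilde{s}_2)$, we achieve the upper bound for $R_1$. In a similar manner, all other upper bounds can be shown to be achievable. This characterizes the maximization problem defining the capacity region for the diagonal vector Gaussian FSM-MAC with partially cooperative encoders and delayed CSI. Note that through this proof we have shown the optimality of the Gaussian multivariate input distribution for this model.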
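Because all covariance blocks are diagonal, the determinant in (116) factors over sub-channels, and the per-sub-channel identity can be checked numerically. The sketch below does so for a single sub-channel; the function names and the scalar parameterization (`g` for the gain, `p` for the input power, `beta` for the squared correlation coefficient, `var_u` for the variance of the auxiliary variable) are assumptions made here for illustration.

```python
def det_joint_cov(g, p, beta, var_u):
    """Determinant of the 2x2 covariance of (g*X + Z, U) for one sub-channel.

    The squared magnitude of the cross-covariance E[X U*] is
    beta * p * var_u by the definition of the squared correlation
    coefficient, and Z has unit variance.
    """
    cross_sq = beta * p * var_u
    top_left = abs(g) ** 2 * p + 1.0  # variance of g*X + Z
    return top_left * var_u - abs(g) ** 2 * cross_sq


def det_closed_form(g, p, beta, var_u):
    """Factorized form: var_u * (|g|^2 * (1 - beta) * p + 1)."""
    return var_u * (abs(g) ** 2 * (1.0 - beta) * p + 1.0)
```

For example, with $g = 1 + j$, $P = 3$, $\beta = 0.25$ and $\sigma_U^2 = 2$, both expressions evaluate to 11 up to floating-point rounding.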

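The careful reader will have noticed, however, that the obtained maximization problem is not convex. In order to convert it to a convex maximization problem, we substitute

$$\gamma_{1,i}(\tilde{s}_1) = \bar{\beta}_{1,i}(\tilde{s}_1) P_{1,i}(\tilde{s}_1), \quad \forall\, \tilde{s}_1 \in \mathcal{S}, \tag{117}$$

$$\gamma_{2,i}(\tilde{s}_1,\tilde{s}_2) = \bar{\beta}_{2,i}(\tilde{s}_1,\tilde{s}_2) P_{2,i}(\tilde{s}_1,\tilde{s}_2), \quad \forall\, (\tilde{s}_1,\tilde{s}_2) \in \mathcal{S}^2, \tag{118}$$

for every $i \in \{1, \ldots, N\}$. This substitution yields the rate bounds given in (81)-(84) and concludes the proof.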



B. Two-State Scalar AWGN Channel Example 

To gain some intuition on the capacity region of the MAC with partially cooperative encoders and delayed CSI, we now consider the scalar Gaussian channel with only two possible states. The scalar channel corresponds to taking $N = 1$ in the diagonal vector channel definition in (79). We denote the two possible channel states by $G$ and $B$ (where $G$ stands for 'Good' and $B$ for 'Bad'), thus $\mathcal{S} = \{G, B\}$. The two states differ in their associated channel gains. When $S = G$, the gains are $g_1(s = G) = g_2(s = G) = g_G$, whereas when $S = B$ the gains are $g_1(s = B) = g_2(s = B) = g_B$. We assume without loss of generality that $g_G > g_B$. The Markov model of the state process is illustrated in Fig. 4.

Fig. 4: Two-state AWGN channel.

The state process is specified by the transition probability matrix:

$$K = \begin{pmatrix} P(G|G) & P(B|G) \\ P(G|B) & P(B|B) \end{pmatrix} = \begin{pmatrix} 1-b & b \\ g & 1-g \end{pmatrix}, \tag{119}$$

which induces the following stationary distribution:

$$\pi = \begin{pmatrix} \pi(G) & \pi(B) \end{pmatrix} = \begin{pmatrix} \dfrac{g}{g+b} & \dfrac{b}{g+b} \end{pmatrix}. \tag{120}$$
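The stationary distribution and the multi-step kernels of the two-state chain can be computed with a few lines of code. This is a minimal sketch under assumptions made here: states are ordered as (Good, Bad), `g` is the probability of moving from Bad to Good, `b` is the probability of moving from Good to Bad, and the helper names are hypothetical.

```python
def transition_matrix(g, b):
    """Row-stochastic kernel with states ordered as (Good, Bad)."""
    return [[1.0 - b, b], [g, 1.0 - g]]


def stationary(g, b):
    """Stationary distribution (pi(G), pi(B)) of the two-state chain."""
    return [g / (g + b), b / (g + b)]


def mat_mul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, d):
    """d-step kernel: the d-th power of m (identity for d = 0)."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(d):
        result = mat_mul(result, m)
    return result
```

With $g = b = 0.1$, the stationary distribution is uniform and the two-step kernel has diagonal entries $0.9^2 + 0.1^2 = 0.82$, so a delayed observation of the state remains informative about the current state.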

We start by examining the impact of the cooperation link capacities, $C_{12}$ and $C_{21}$, on the capacity region. We particularize here to the case of symmetric CSI delays, i.e., $d_1 = d_2 = d$. Note that since $d_1 = d_2$, it immediately follows that $\tilde{S}_1 = \tilde{S}_2 = \tilde{S}$. The capacity region is presented in Fig. 5 for three different cases: (a) the case of symmetrical capacities, i.e., $C_{12} = C_{21}$; (b) the case of a single cooperation link, i.e., $C_{12} > C_{21} = 0$; and (c) the case of one infinite cooperation link, i.e., $C_{12} < C_{21} = \infty$. This is done by numerically solving the maximization problem from Theorem 3 for the above three cases using CVX. Throughout this example we
assume $P_1 = P_2 = 10$, $g_B = 0.01$, $g_G$ ^ ^, $g = b = 0.1$ and $d = 2$ (results of similar nature were also observed for $g_B = 0.2$ and $g_B = 0.3$).

Note that in Fig. 5(a), which presents the region for the symmetrical case, as $C_{12} = C_{21}$ grows without bound, the capacity region increases and eventually takes the shape of a triangle. This is because the first three constraints on the rates $(R_1, R_2)$, as given by (81)-(83), grow without bound as well, and thus the binding constraint is the sum-rate constraint of (84). For the single cooperation link case in Fig. 5(b), the value of $R_2$ remains fixed as $C_{12}$ grows. This is due to the fact that the constraint on $R_2$ in (82) does not change with $C_{12}$ and stays fixed at approximately 0.9642. Finally, for Fig. 5(c), which presents the case of infinite cooperation link capacity $C_{21} = \infty$, we have that the constraint on $R_2$ in (82) and the first constraint on the sum rate in (83) are both redundant. Hence, the only meaningful constraint on $R_2$ is (84), which does not involve $C_{21}$ (or $C_{12}$).

Fig. 5(a): Capacity region for symmetrical capacities $C_{12} = C_{21}$ and delay $d = 2$.

Next, we demonstrate the fact that the capacity region of this setting grows as the cooperation link capacities grow, regardless of the specific assumptions on the relation between the delays of the CSI available at the encoders. In order to do so, we present the maximum sum rate versus the cooperation link capacities for three different possible relations between the delays: (a) $d_1 = d_2 = 2$, (b) $2 = d_2 < d_1 = \infty$ and (c) $2 = d_1 > d_2 = 0$. For all three cases we assume $C_{12} = C_{21}$ and use the same values of the channel parameters as before. The curves are shown in Figs. 6(a)-(c).

As expected, the sum rate for the case of asymmetrical delays (which has the best CSI properties of the three) reaches the highest value as the capacities grow, whereas the sum rate for the infinite delay case (which has the worst CSI properties) reaches the lowest value. Moreover, we note the correspondence between Fig. 6(a) and Fig. 5(a) (both corresponding to the case of symmetrical delays and equal cooperation link capacities). This manifests itself in the fact that in both figures, as $C_{12} = C_{21}$ grow without bound, the sum rate approaches its maximal value, which is approximately 1.5 bits per symbol.

Fig. 6: The sum rate versus the cooperation link capacities $C_{12} = C_{21}$ [bits/symbol] for three different cases of delayed CSI: (a) symmetrical delays, $d_1 = d_2 = 2$; (b) infinite delay, $2 = d_2 < d_1 = \infty$; (c) asymmetrical delays, $0 = d_2 < d_1 = 2$. The dashed line corresponds to the case where $C_{12} = C_{21} = \infty$.

Another interesting aspect of the Gaussian channel example is the effect of the signal-to-noise ratio (SNR) on the correlations between the auxiliary RV, $U$, and the RVs $X_1$ and $X_2$. These correlations are associated with the degree of cooperation used in the scheme. We assume $P_1 = P_2 = P$ and $g_1 = g_2 = 1$, thus the SNR in fact equals the transmission power $P$. In order to examine the effect of the SNR on the correlations we restrict ourselves to the case where $|\mathcal{S}| = 1$, i.e., a single and constant channel state [37]. Throughout this analysis we use the same notations and expressions for the rate bounds as in [37]. Note that for the case where $|\mathcal{S}| = 1$ the original maximization problem turns out to be concave; thus, no transformation is needed. The only variables in the original maximization problem are $\beta_1$ and $\beta_2$, which are defined through



E[UXi 



Pl, - 



E[UX2 



P2- 



(121) 



We consider the case of symmetrical cooperation link capacities, i.e., $C_{12} = C_{21}$. By the symmetry of the maximization problem in $(\beta_1, \beta_2)$, optimality is achieved when $\beta_1 = \beta_2$. For this reason we use the notation $\beta \triangleq \beta_1 = \beta_2$ and plot a single curve representing both correlations (which are calculated directly from $\beta$ according to (121)). The numerical results are shown in Fig. 7. The blue and green dashed lines designate the asymptotic value of the correlation and the critical SNR at which the correlation drops from unity, respectively. Results are shown for six different values of $C_{12} = C_{21}$.

Fig. 7: Correlation as a function of SNR for different values of the capacities $C_{12} = C_{21}$.

Although the effect of the SNR on the correlations could not be calculated analytically, we use asymptotic evaluations in order to gain some additional insight. Namely, we demonstrate that the optimal correlation admits

\[ \rho^* = \begin{cases} 1, & \mathrm{SNR} < \mathrm{SNR}^{\mathrm{crit}} \\ 1 - \dfrac{2}{2^{2(C_{12}+C_{21})}+1}, & \mathrm{SNR} \to \infty \end{cases} \tag{122} \]

where $\mathrm{SNR}^{\mathrm{crit}} = 10\log_{10}\left(\dfrac{2^{2(C_{12}+C_{21})}-1}{4}\right)$ [dB].



We start by justifying the observation that the correlation approaches 1 for small SNR values. For some positive value of $C_{12} = C_{21}$, and for $P_1, P_2 \ll 1$, consider (cf. (83)-(84)):

\begin{align*}
R_1 + R_2 &\le \min\Big\{ \tfrac{1}{2}\log\big(1 + \beta(g_1^2 P_1 + g_2^2 P_2)\big) + C_{12} + C_{21},\ \tfrac{1}{2}\log\big(1 + g_1^2 P_1 + g_2^2 P_2 + 2 g_1 g_2 \sqrt{\bar{\beta}^2 P_1 P_2}\big) \Big\} \\
&= \tfrac{1}{2}\log\big(1 + g_1^2 P_1 + g_2^2 P_2 + 2 g_1 g_2 \sqrt{\bar{\beta}^2 P_1 P_2}\big), \tag{123}
\end{align*}

where $\bar{\beta} \triangleq 1 - \beta$. Now note that the last term in (123) is maximized for $\beta^* = 0$, which, in turn, implies that the correlation is equal to unity. As shown in Fig. 7, for smaller values of SNR the correlation is indeed higher, as if the scheme aims to compensate for the low SNR using cooperation.

The asymptotic evaluation for low SNRs is valid up to some critical SNR value at which the correlation drops from its maximal value of 1. We calculate this critical SNR next. We denote the SNR value of interest by $\mathrm{SNR}^{\mathrm{crit}}$ and define it as

\[ \mathrm{SNR}^{\mathrm{crit}} = \sup\{P : \rho^* = 1\}. \tag{124} \]

In order to calculate $\mathrm{SNR}^{\mathrm{crit}}$ we restrict the analysis to the segment of SNRs in which the correlation is maximal (or equivalently, $\beta^* = 0$) and consider (123) taken for $P_1 = P_2 = P$ and $g_1 = g_2 = 1$. As shown in (123), when $\beta^* = 0$ and $P = 0$, the second logarithm achieves the minimum between the two terms. Fixing $\beta^* = 0$ and increasing $P$ increases the second logarithm in (123) while the first term remains unchanged and equals $C_{12} + C_{21}$. As long as

\[ \tfrac{1}{2}\log\big(1 + 2P + 2\bar{\beta}P\big)\Big|_{\beta = 0} < C_{12} + C_{21} \tag{125} \]

holds, the optimum is achieved for $\beta^* = 0$. However, when (125) is no longer valid, the optimal value of $\beta$ must deviate from 0. Thus, calculating $\mathrm{SNR}^{\mathrm{crit}}$ reduces to solving the following equation:

\[ \tfrac{1}{2}\log\big(1 + 2P + 2\bar{\beta}P\big)\Big|_{\beta = 0} = C_{12} + C_{21}, \tag{126} \]

yielding

\[ \mathrm{SNR}^{\mathrm{crit}} = \frac{2^{2(C_{12}+C_{21})} - 1}{4}. \tag{127} \]

The value of $\mathrm{SNR}^{\mathrm{crit}}$ [dB] is represented by the vertical dashed green line in the plots shown in Fig. 7; again, the numerical calculations match the analytical results. Note that as the capacities $C_{12} = C_{21}$ grow, so does the value of $\mathrm{SNR}^{\mathrm{crit}}$, and hence the transition between the low- and high-SNR regimes occurs at a higher SNR.
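The critical SNR can be evaluated directly from the closed form $(2^{2(C_{12}+C_{21})} - 1)/4$. The sketch below is only an illustration under the assumptions of this subsection (unit gains, equal powers, base-2 logarithms so that rates are in bits); the function names are hypothetical.

```python
import math


def critical_snr(c12, c21):
    """Linear-scale critical SNR: the P at which (1/2) log2(1 + 4P) = C12 + C21."""
    return (2 ** (2 * (c12 + c21)) - 1) / 4


def critical_snr_db(c12, c21):
    """Critical SNR in dB; requires C12 + C21 > 0 so that the argument is positive."""
    return 10 * math.log10(critical_snr(c12, c21))


p_crit = critical_snr(0.2, 0.2)
# At P = p_crit the second sum-rate term, evaluated at beta = 0, equals C12 + C21.
sum_rate_at_crit = 0.5 * math.log2(1 + 4 * p_crit)
```

For example, `critical_snr(0.5, 0.5)` evaluates to 0.75 on a linear scale, which corresponds to a threshold slightly below 0 dB.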

As the SNR grows, the correlation asymptotically approaches some value in the interval $(0, 1)$. In order to find this asymptotic correlation, we present the following analysis for the high-SNR regime (assuming $P_1, P_2 \gg 1$). We start by excluding $\beta^* = 0$ as a possible solution for this case (a fact which will be used subsequently). Fixing $C_{12} = C_{21}$ and substituting $\beta = 0$ into the sum-rate bounds on $R_1 + R_2$ yields (cf. (83)-(84)):

\begin{align*}
R_1 + R_2 &\le \min\Big\{ C_{12} + C_{21},\ \tfrac{1}{2}\log\big(1 + g_1^2 P_1 + g_2^2 P_2 + 2 g_1 g_2 \sqrt{P_1 P_2}\big) \Big\} \\
&\stackrel{(a)}{=} C_{12} + C_{21}, \tag{128}
\end{align*}

where (a) follows from the fact that $P_1, P_2 \gg 1$.

Thus we get that for an infinite SNR, by taking $\beta = 0$ the sum rate is bounded by the sum of capacities. However, $C_{12} + C_{21}$ is a constant which does not depend on the powers $P_1$ and $P_2$, whereas for any fixed $\beta \in (0, 1)$ both terms of the minimum in (123) grow without bound with the powers. We therefore conclude that $\beta^*$ cannot be equal to zero.

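Finally, assuming $\beta^* > 0$ we calculate $\beta^*$ by using some approximations which are easily justified at high SNR. First, note that the first logarithm in (123) is monotonically increasing in $\beta$ whereas the second is monotonically decreasing in $\beta$. This implies that the optimum is achieved at the value of $\beta$ at which the two functions intersect, that is

\[ \tfrac{1}{2}\log\big(1 + \beta(g_1^2 P_1 + g_2^2 P_2)\big) + C_{12} + C_{21} = \tfrac{1}{2}\log\big(1 + g_1^2 P_1 + g_2^2 P_2 + 2 g_1 g_2 \sqrt{\bar{\beta}^2 P_1 P_2}\big). \tag{129} \]

Using the fact that for high SNR we have

\begin{align*}
\tfrac{1}{2}\log\big(1 + \beta(g_1^2 P_1 + g_2^2 P_2)\big) + C_{12} + C_{21} &\approx \tfrac{1}{2}\log\big(\beta(g_1^2 P_1 + g_2^2 P_2)\big) + C_{12} + C_{21}, \\
\tfrac{1}{2}\log\big(1 + g_1^2 P_1 + g_2^2 P_2 + 2 g_1 g_2 \sqrt{\bar{\beta}^2 P_1 P_2}\big) &\approx \tfrac{1}{2}\log\big(g_1^2 P_1 + g_2^2 P_2 + 2 g_1 g_2 \sqrt{\bar{\beta}^2 P_1 P_2}\big),
\end{align*}

the equation in (129) reduces to

\[ \beta (g_1^2 P_1 + g_2^2 P_2)\, 2^{2(C_{12}+C_{21})} = g_1^2 P_1 + g_2^2 P_2 + 2 g_1 g_2 \sqrt{\bar{\beta}^2 P_1 P_2}. \]

In order to further simplify the analysis we again assume a unit channel gain, that is $g_1 = g_2 = 1$. After some algebra we obtain that the intersection point is given by

\[ \beta^* = \frac{P_1 + P_2 + 2\sqrt{P_1 P_2}}{2^{2(C_{12}+C_{21})}(P_1 + P_2) + 2\sqrt{P_1 P_2}}. \tag{130} \]

By taking $P_1 = P_2 = P$, (130) reduces to

\[ \beta^* = \frac{2}{2^{2(C_{12}+C_{21})} + 1}. \tag{131} \]

Therefore the optimal correlation, $\rho^*$, at infinite SNR is given by

\[ \rho^* = 1 - \frac{2}{2^{2(C_{12}+C_{21})} + 1}. \tag{132} \]

The value of $\rho^*$, for each value of the cooperation link capacities $C_{12}$ and $C_{21}$, is represented by the horizontal dashed blue line in the plots shown in Fig. 7. Note that the numerical calculations indeed match the asymptotic results for large values of SNR.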
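The two regimes can also be checked numerically. The sketch below assumes unit gains, $P_1 = P_2 = P$, $\beta_1 = \beta_2 = \beta$ and base-2 logarithms, and maximizes the minimum of the two sum-rate terms over $\beta$ with a plain grid search. It is an illustration only, not the CVX-based procedure used for the figures, and all function names are hypothetical.

```python
import math


def sum_rate_bound(beta, p, c_sum):
    """Minimum of the two sum-rate terms for g1 = g2 = 1 and P1 = P2 = p."""
    coop_term = 0.5 * math.log2(1 + 2 * beta * p) + c_sum
    mac_term = 0.5 * math.log2(1 + 2 * p + 2 * (1 - beta) * p)
    return min(coop_term, mac_term)


def optimal_beta(p, c_sum, grid=10001):
    """Grid search over beta in [0, 1] (an assumption of this sketch, not a solver)."""
    best_k = max(range(grid), key=lambda k: sum_rate_bound(k / (grid - 1), p, c_sum))
    return best_k / (grid - 1)


def asymptotic_beta(c_sum):
    """High-SNR intersection point 2 / (2^(2 c_sum) + 1), with c_sum = C12 + C21."""
    return 2 / (2 ** (2 * c_sum) + 1)


c_sum = 1.0                            # e.g. C12 = C21 = 0.5, critical SNR 0.75
beta_low = optimal_beta(0.5, c_sum)    # below the critical SNR
beta_high = optimal_beta(1e6, c_sum)   # deep in the high-SNR regime
```

Below the critical SNR the search returns $\beta = 0$ (full correlation), while at very high SNR it settles near the asymptotic intersection point, in line with the analysis above.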

To conclude, we interpret the numerical and analytical results in terms of the optimal transmission strategies of the users for each SNR regime. Recall that the symbols of the codewords transmitted by the users are modeled by the RVs $X_1$ and $X_2$. The fact that for low SNR the correlation is at its maximal value of unity implies that both users tend to transmit the same codewords; this, in turn, indicates that they transmit the same message. However, the only common information the users share is the common message that they have created using the conference. Therefore we conclude that when the channel quality is low, the best strategy for the users is to transmit only the common message and forfeit their private messages (i.e., the parts of their original messages which they have not managed to share). As the SNR grows beyond $\mathrm{SNR}^{\mathrm{crit}}$, the correlation between the code symbols decreases to some positive value $\rho^* \in (0, 1)$, asymptotically approaching (132). This is because, when a higher quality channel is experienced, each user transmits not only the common (correlated) message but also his private (uncorrelated) message.

Alternatively, inspecting the trend of the optimal correlation from the rate perspective provides some additional intuition. As long as the transmission sum rate satisfies $R_1 + R_2 < C_{12} + C_{21}$, the transmissions consist only of the correlated common message; namely, the users are fully cooperative. Once the sum rate surpasses the sum of the cooperation link capacities, i.e., $R_1 + R_2 > C_{12} + C_{21}$, the transmitted code symbols integrate both common and private messages, which causes the optimal correlation to drop.

VI. SUMMARY AND CONCLUDING REMARKS 

In this paper we have considered the FSM-MAC with partially cooperative encoders and delayed CSI and derived its capacity region. The achievability proof used another result of this paper, namely, the capacity region of the FSM-MAC with a common message and delayed CSI. The latter result was obtained using rate splitting, multiplexing and simultaneous decoding. This approach circumvents the need to rely on the capacity region's corner points, which becomes cumbersome when their number is large. Furthermore, it is easily extendable and, therefore, establishes the basis for simultaneous decoding schemes involving multiple users.

The general conferencing result was then applied to the special case of the Gaussian vector MAC with diagonal channel transfer matrices, which models OFDM-based communication systems. The capacity region is presented in the form of a convex optimization problem and it establishes the optimality of Gaussian Markovian inputs. This result serves as a generalization of ^j] to the vector state-dependent case. Focusing on a two-state Gaussian FSM-MAC example, the crucial role of cooperation for low SNR values was demonstrated.

Extensions of the results for the Gaussian vector FSM-MAC to general MIMO settings (see, e.g., (3E]), as well as the ISI channel, are currently under investigation.
Appendix A

PROOF OF THE MARKOV RELATION IN (25)-(27)
We prove the Markov relations (25)-(27) using the following claims. The equality in (25) follows from the fact that $(M_0, S^{q-d_1-1}) - S_{q-d_1} - S_{q-d_2} - S_q$ forms a Markov chain, and so does $(M_0, S^{q-d_1-1}, Q) - S_{q-d_1} - S_{q-d_2} - S_q$.
To show (26) consider the following relations

\begin{align*}
P(x_{1,q}|s_q, s_{q-d_1}, s_{q-d_2}, u_q, q) &= P(x_{1,q}|s_q, s_{q-d_1}, s_{q-d_2}, m_0, s^{q-d_1-1}, q) \\
&= \sum_{m_1 \in \mathcal{M}_1} P(m_1, x_{1,q}|s_q, s_{q-d_1}, s_{q-d_2}, m_0, s^{q-d_1-1}, q) \\
&= \sum_{m_1 \in \mathcal{M}_1} P(m_1|s_q, s_{q-d_1}, s_{q-d_2}, m_0, s^{q-d_1-1}, q)\, P(x_{1,q}|s_q, s_{q-d_1}, s_{q-d_2}, m_0, m_1, s^{q-d_1-1}, q) \\
&\stackrel{(a)}{=} \sum_{m_1 \in \mathcal{M}_1} P(m_1|s_{q-d_1}, m_0, s^{q-d_1-1}, q)\, P(x_{1,q}|s_{q-d_1}, m_0, m_1, s^{q-d_1-1}, q) \\
&= \sum_{m_1 \in \mathcal{M}_1} P(m_1, x_{1,q}|s_{q-d_1}, m_0, s^{q-d_1-1}, q) \\
&= P(x_{1,q}|s_{q-d_1}, m_0, s^{q-d_1-1}, q) \tag{133}
\end{align*}

where (a) follows from the fact that $M_1$ is independent of $(M_0, S^n)$ and the fact that $X_{1,q}$ is a deterministic function of $(M_0, M_1, S_{q-d_1}, S^{q-d_1-1})$. Now, since this is true for all $q$, and because the auxiliary RV is defined as $U = (M_0, S^{Q-d_1-1}, Q)$, it holds that

\[ P(x_1|s, \tilde{s}_1, \tilde{s}_2, u) = P(x_1|\tilde{s}_1, u). \tag{134} \]
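Finally, to show (27) we use the following relations

\begin{align*}
P(x_{2,q}|x_{1,q}, s_q, s_{q-d_1}, s_{q-d_2}, u_q, q) &= P(x_{2,q}|x_{1,q}, s_q, s_{q-d_1}, s_{q-d_2}, m_0, s^{q-d_1-1}, q) \\
&= \sum_{m_2 \in \mathcal{M}_2} P(m_2, x_{2,q}|x_{1,q}, s_q, s_{q-d_1}, s_{q-d_2}, m_0, s^{q-d_1-1}, q) \\
&= \sum_{m_2 \in \mathcal{M}_2} P(m_2|x_{1,q}, s_q, s_{q-d_1}, s_{q-d_2}, m_0, s^{q-d_1-1}, q)\, P(x_{2,q}|x_{1,q}, s_q, s_{q-d_1}, s_{q-d_2}, m_0, m_2, s^{q-d_1-1}, q) \\
&\stackrel{(a)}{=} \sum_{m_2 \in \mathcal{M}_2} P(m_2|s_{q-d_1}, s_{q-d_2}, m_0, s^{q-d_1-1}, q)\, P(x_{2,q}|s_{q-d_1}, s_{q-d_2}, m_0, m_2, s^{q-d_1-1}, q) \\
&= \sum_{m_2 \in \mathcal{M}_2} P(m_2, x_{2,q}|s_{q-d_1}, s_{q-d_2}, m_0, s^{q-d_1-1}, q) \\
&= P(x_{2,q}|s_{q-d_1}, s_{q-d_2}, m_0, s^{q-d_1-1}, q) \tag{135}
\end{align*}

where (a) follows from the fact that $M_2$ is independent of $(X_1, S^n)$ given $M_0$ and the fact that $X_{2,q}$ is independent of $(X_{1,q}, S_q)$ given $(M_0, M_2, S_{q-d_1}, S_{q-d_2}, S^{q-d_1-1})$. Again, since the above holds for every $q$, and by the definition of the RV $U$, we conclude that

\[ P(x_2|x_1, s, \tilde{s}_1, \tilde{s}_2, u) = P(x_2|\tilde{s}_1, \tilde{s}_2, u). \]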
Appendix B

ANALYSIS OF THE PROBABILITY OF ERROR IN THE ACHIEVABILITY PROOF OF THEOREM^ 
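First, we analyze the probability of an encoding length mismatch for the component codewords of the sub-messages $m_{0\tilde{s}_1}$, $m_{1\tilde{s}_1}$ or $m_{2\tilde{s}_1\tilde{s}_2}$, i.e., $P(N_{\tilde{s}_1} < n_1(\tilde{s}_1))$ and $P(N_{\tilde{s}_1,\tilde{s}_2} < n_2(\tilde{s}_1, \tilde{s}_2))$. Since the state process is stationary and ergodic, $\lim_{n \to \infty} \frac{N_{\tilde{s}_1}}{n} = P(\tilde{s}_1)$ and $\lim_{n \to \infty} \frac{N_{\tilde{s}_1,\tilde{s}_2}}{n} = P(\tilde{s}_1, \tilde{s}_2)$ in probability. Therefore, $P(N_{\tilde{s}_1} < n_1(\tilde{s}_1)) \to 0$ and $P(N_{\tilde{s}_1,\tilde{s}_2} < n_2(\tilde{s}_1, \tilde{s}_2)) \to 0$ as $n \to \infty$. Hence, the probability of a decoding error being induced by a mismatch between the actual number of channel state realizations and the expected lengths used in the encoding stage goes to zero.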
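The ergodicity argument can be illustrated by simulation. The sketch below reuses the two-state chain of Section V-B with $g = b = 0.1$ purely as a hypothetical example (the argument itself holds for any stationary and ergodic state process) and measures the empirical fraction of time spent in each state; the function name and the fixed seed are choices made for this illustration.

```python
import random


def occupancy_fractions(g, b, n, seed=0):
    """Empirical state frequencies N_s / n of the two-state chain over n steps."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    state = "G"
    counts = {"G": 0, "B": 0}
    for _ in range(n):
        counts[state] += 1
        u = rng.random()
        if state == "G":
            state = "B" if u < b else "G"   # leave G with probability b
        else:
            state = "G" if u < g else "B"   # leave B with probability g
    return {s: c / n for s, c in counts.items()}


fractions = occupancy_fractions(0.1, 0.1, 200000)
```

For long blocks the empirical frequencies concentrate around the stationary probabilities, which is why the event of having too few occurrences of a given state becomes negligible.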

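Next, we analyze the probability of a decoding error. Without loss of generality let us assume that, for a given $\tilde{S}_1 = \tilde{s}_1$, the $k + 2$ sub-messages of interest that were sent are $(m_{0\tilde{s}_1}, m_{1\tilde{s}_1}, \mathbf{m}_2(\tilde{s}_1)) = (1, 1, \mathbf{1})$, where $\mathbf{1}$ is a $k$-dimensional vector of 1's. First, we note that an error in the decoding of $X_2^{n_1(\tilde{s}_1)}(m_{0\tilde{s}_1}, \mathbf{m}_2(\tilde{s}_1))$ can occur as a result of an error in any subset of the component reconstructed sub-messages of $\mathbf{m}_2(\tilde{s}_1)$, i.e., in any subset of the set $\{m_{2,\tilde{s}_1,1}, m_{2,\tilde{s}_1,2}, \ldots, m_{2,\tilde{s}_1,k}\}$. In order to deal with such an event, we define the following two sets:

\[ \mathcal{S}_T = \{\tilde{s}_2 \in \mathcal{S} : \hat{m}_{2\tilde{s}_1\tilde{s}_2} = 1\}, \tag{136} \]
\[ \mathcal{S}_F = \mathcal{S}_T^c = \{\tilde{s}_2 \in \mathcal{S} : \hat{m}_{2\tilde{s}_1\tilde{s}_2} \neq 1\}. \tag{137} \]

$\mathcal{S}_T$ is the subset that contains all the states $\tilde{s}_2 \in \mathcal{S}$ for which no decoding error occurred and the reconstructed sub-message is correct (here the subscript $T$ stands for 'True'). $\mathcal{S}_F$ is the subset that contains all the states $\tilde{s}_2 \in \mathcal{S}$ for which a decoding error occurred and the reconstructed sub-message is incorrect (here the subscript $F$ stands for 'False').

We note that if, for some $\tilde{s}_1 \in \mathcal{S}$, an error occurred in all reconstructed sub-messages $\hat{m}_{2,\tilde{s}_1,\tilde{s}_2}$, where $\tilde{s}_2 \in \mathcal{S}$, we have $\mathcal{S}_T = \emptyset$ and $\mathcal{S}_F = \mathcal{S}$. If no error occurred we have $\mathcal{S}_F = \emptyset$ and $\mathcal{S}_T = \mathcal{S}$. Moreover, we note that the reconstructed codeword $X_2^{n_1(\tilde{s}_1)}$ can now be described as a function of the common sub-message $m_{0\tilde{s}_1}$, the two sets (136)-(137) and a set of $|\mathcal{S}_F|$ indices $q_{\tilde{s}_2}$. The latter set specifies, for each sub-message $\hat{m}_{2\tilde{s}_1\tilde{s}_2} \neq 1$ with $\tilde{s}_2 \in \mathcal{S}_F$, the index $q_{\tilde{s}_2} \in \{2, 3, \ldots, 2^{n_2(\tilde{s}_1,\tilde{s}_2)R_2(\tilde{s}_1,\tilde{s}_2)}\}$ that it equals. Therefore, henceforth we use the notation $X_2^{n_1(\tilde{s}_1)}(m_{0\tilde{s}_1}, \mathbf{v}_{\mathcal{S}_F})$, where $\mathbf{v}_{\mathcal{S}_F} = (v(1), v(2), \ldots, v(k))$ is a $k$-dimensional vector of indices such that

\[ v(\ell) = \begin{cases} 1, & \ell \in \mathcal{S}_T \\ q_\ell, & \ell \in \mathcal{S}_F. \end{cases} \tag{138} \]

Note that for every $\mathcal{S}_F \neq \emptyset$ we have $\mathbf{v}_{\mathcal{S}_F} \neq \mathbf{1}$.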

In addition to these definitions we introduce the following lemma.

Lemma 6 Let $P_{XY}(x, y)$ denote the joint distribution of two RVs $(X, Y)$ on $\mathcal{X} \times \mathcal{Y}$, let $P_X(x)$ and $P_Y(y)$ denote their marginal distributions and let $n = n_1 + n_2$. Further, let the sequences $(x^{n_1}, y^{n_1})$ and $(x^{n_2}, y^{n_2})$ be drawn in an i.i.d. manner according to $P_{XY}(x, y)$ and $P_X(x)P_Y(y)$, respectively. Let $(x^n, y^n)$ be a concatenation of $(x^{n_1}, y^{n_1})$ and $(x^{n_2}, y^{n_2})$. Then,

\[ (1 - \delta_{\epsilon,n}) \cdot 2^{-n_2(I(X;Y) + \delta_\epsilon)} \le P\{(x^n, y^n) \in \mathcal{T}_\epsilon^{(n)}\} \le 2^{-n_2(I(X;Y) - \delta_\epsilon)}, \tag{139} \]

where $\delta_\epsilon \to 0$ and $\delta_{\epsilon,n} \to 0$ as $\epsilon \to 0$.

The proof of Lemma 6 is given in Appendix E. Lemma 6 gives rise to another observation.

Lemma 7 Let $P_{XYWU}(x, y, w, u)$ denote the joint distribution of the tuple of RVs $(X, Y, U, W)$ on $\mathcal{X} \times \mathcal{Y} \times \mathcal{U} \times \mathcal{W}$. Let $P_{XY|WU}(x, y|w, u)$ denote the conditional joint distribution of $X, Y$ given $U, W$, and let $P_{X|WU}(x|w, u)$ and $P_{Y|WU}(y|w, u)$ denote the corresponding marginal conditional distributions of $X$ and $Y$. Define $n_w = P_W(W = w) \cdot n$; clearly $n = \sum_{w \in \mathcal{W}} n_w$. Let $\mathcal{W}_F = \{w : P_{XY|U,W=w} = P_{X|U,W=w}P_{Y|U,W=w}\}$. Let $\{(x^{n_w}, y^{n_w})\}_{w \in \mathcal{W}}$ be a set of sequences which are drawn in an i.i.d. manner according to $P_{X|U,W=w}P_{Y|U,W=w}$ when $w \in \mathcal{W}_F$, and according to $P_{XY|U,W=w}$ when $w \in \mathcal{W}_T$. Let $(x^n, y^n)$ be a concatenation of the set of sequences $\{(x^{n_w}, y^{n_w})\}_{w \in \mathcal{W}}$. Then,

\[ (1 - \delta_{\epsilon,n}) \cdot 2^{-\sum_{w \in \mathcal{W}_F} n_w (I(X;Y|U,W=w) + \delta_\epsilon)} \le P\{(x^n, y^n) \in \mathcal{T}_\epsilon^{(n)}\} \le 2^{-\sum_{w \in \mathcal{W}_F} n_w (I(X;Y|U,W=w) - \delta_\epsilon)}, \tag{140} \]

where $\delta_\epsilon \to 0$ and $\delta_{\epsilon,n} \to 0$ as $\epsilon \to 0$.

Lemma 7 follows directly from Lemma 6 and its proof is therefore omitted. Note that even though both lemmas refer to sequences $(x^n, y^n)$ that are created by concatenation, the results can be generalized to sequences $(x^n, y^n)$ that are created by any mixture of the sub-sequences (and not necessarily concatenation).
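Next, for simplicity, we use the notation in (33)-(38) and introduce the following

\[ n_1(\tilde{s}_1) \triangleq n_1, \tag{141} \]
\[ n_2(\tilde{s}_1, \tilde{s}_2 = \ell) \triangleq n_{2\ell}. \tag{142} \]

Note that

\[ n_{2\ell} = n_1 P_\ell, \tag{143} \]

for a proper choice of $\epsilon'$ and $\epsilon''$ (namely, $\epsilon'' = \epsilon' P_\ell$).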

Using these notations we can define the events that correspond to all possible decoding errors (recall (l2Fi ). We 
start with the events in which no errors occurred in the decoding of the sub-messages of m2(si) (i.e., in these 



events m2(si) is presumed to be correct): 

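\begin{align*}
E_1 &= \{(U^{n_1}(1), X_1^{n_1}(1,1), X_2^{n_1}(1,\mathbf{1}), Y^{n_1}, S^{n_1}, \tilde{S}_2^{n_1}) \notin \mathcal{T}_\epsilon^{(n_1)} \,|\, \tilde{S}_1 = \tilde{s}_1\}, \tag{144} \\
E_2 &= \{\exists i \neq 1 : (U^{n_1}(i), X_1^{n_1}(i,1), X_2^{n_1}(i,\mathbf{1}), Y^{n_1}, S^{n_1}, \tilde{S}_2^{n_1}) \in \mathcal{T}_\epsilon^{(n_1)} \,|\, \tilde{S}_1 = \tilde{s}_1\}, \tag{145} \\
E_3 &= \{\exists j \neq 1 : (U^{n_1}(1), X_1^{n_1}(1,j), X_2^{n_1}(1,\mathbf{1}), Y^{n_1}, S^{n_1}, \tilde{S}_2^{n_1}) \in \mathcal{T}_\epsilon^{(n_1)} \,|\, \tilde{S}_1 = \tilde{s}_1\}, \tag{146} \\
E_4 &= \{\exists (i,j) \neq (1,1) : (U^{n_1}(i), X_1^{n_1}(i,j), X_2^{n_1}(i,\mathbf{1}), Y^{n_1}, S^{n_1}, \tilde{S}_2^{n_1}) \in \mathcal{T}_\epsilon^{(n_1)} \,|\, \tilde{S}_1 = \tilde{s}_1\}. \tag{147}
\end{align*}

Next, we present the events in which $\mathbf{m}_2(\tilde{s}_1)$ is fully or partly incorrect. For a given set $\mathcal{S}_F \subseteq \mathcal{S}$, let us define:

\begin{align*}
E_5(\mathcal{S}_F) &= \{\exists \{q_\ell\}_{\ell \in \mathcal{S}_F}, \mathbf{v}_{\mathcal{S}_F} \neq \mathbf{1} : (U^{n_1}(1), X_1^{n_1}(1,1), X_2^{n_1}(1,\mathbf{v}_{\mathcal{S}_F}), Y^{n_1}, S^{n_1}, \tilde{S}_2^{n_1}) \in \mathcal{T}_\epsilon^{(n_1)} \,|\, \tilde{S}_1 = \tilde{s}_1\}, \tag{148} \\
E_6(\mathcal{S}_F) &= \{\exists j \neq 1, \exists \{q_\ell\}_{\ell \in \mathcal{S}_F}, \mathbf{v}_{\mathcal{S}_F} \neq \mathbf{1} : (U^{n_1}(1), X_1^{n_1}(1,j), X_2^{n_1}(1,\mathbf{v}_{\mathcal{S}_F}), Y^{n_1}, S^{n_1}, \tilde{S}_2^{n_1}) \in \mathcal{T}_\epsilon^{(n_1)} \,|\, \tilde{S}_1 = \tilde{s}_1\}, \tag{149} \\
E_7(\mathcal{S}_F) &= \{\exists i \neq 1, \exists \{q_\ell\}_{\ell \in \mathcal{S}_F}, \mathbf{v}_{\mathcal{S}_F} \neq \mathbf{1} : (U^{n_1}(i), X_1^{n_1}(i,1), X_2^{n_1}(i,\mathbf{v}_{\mathcal{S}_F}), Y^{n_1}, S^{n_1}, \tilde{S}_2^{n_1}) \in \mathcal{T}_\epsilon^{(n_1)} \,|\, \tilde{S}_1 = \tilde{s}_1\}, \tag{150} \\
E_8(\mathcal{S}_F) &= \{\exists (i,j) \neq (1,1), \exists \{q_\ell\}_{\ell \in \mathcal{S}_F}, \mathbf{v}_{\mathcal{S}_F} \neq \mathbf{1} : (U^{n_1}(i), X_1^{n_1}(i,j), X_2^{n_1}(i,\mathbf{v}_{\mathcal{S}_F}), Y^{n_1}, S^{n_1}, \tilde{S}_2^{n_1}) \in \mathcal{T}_\epsilon^{(n_1)} \,|\, \tilde{S}_1 = \tilde{s}_1\}. \tag{151}
\end{align*}

Using the union of events bound, we can bound the probability of a decoding error by:

\[ P_e \le P\Big( \bigcup_{i=1}^{4} E_i \cup \bigcup_{\mathcal{S}_F \subseteq \mathcal{S}} \bigcup_{i=5}^{8} E_i(\mathcal{S}_F) \Big) \le \sum_{i=1}^{4} P(E_i) + \sum_{\mathcal{S}_F \subseteq \mathcal{S}} \sum_{i=5}^{8} P(E_i(\mathcal{S}_F)). \tag{152} \]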

We need to show that for the coding scheme presented in Section BlI-BI and for a rate triplet $(R_0, R_1, R_2)$ as given in Theorem [T], $P_e \to 0$ as $n \to \infty$.

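1) $P(E_1) \to 0$ as $n_1 \to \infty$ from the law of large numbers.

2) In order to upper bound $P(E_3)$ consider:

\[ P(E_3) \le \sum_{j=2}^{2^{n_1 R_1'}} 2^{-n_1(I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2) - \delta_\epsilon)} \le 2^{n_1(R_1' - I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2) + \delta_\epsilon)}. \]

So in order to have $P(E_3) \to 0$ as $n_1 \to \infty$ the following must hold:

\[ R_1' < I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2). \tag{153} \]

We proceed with the upper bounding of (148), (149) and (151), for a given set $\mathcal{S}_F \subseteq \mathcal{S}$.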

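3) In order to upper bound $P(E_5(\mathcal{S}_F))$ for a given set $\mathcal{S}_F$, first consider:

\begin{align*}
P(E_5(\mathcal{S}_F)) &\stackrel{(a)}{\le} 2^{\sum_{\ell \in \mathcal{S}_F} n_{2\ell} R'_{2\ell}} \cdot \prod_{\ell \in \mathcal{S}_F} 2^{-n_{2\ell}(I(X_2;Y|X_1,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell) - \delta_\epsilon)} \\
&= 2^{\sum_{\ell \in \mathcal{S}_F} n_{2\ell}(R'_{2\ell} - I(X_2;Y|X_1,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell) + \delta_\epsilon)} \tag{154}
\end{align*}

where (a) follows by applying the union bound and Lemma 7. This is because the errors of interest occur in the component codewords $X_2^{n_{2\ell}}(1, q_\ell)$, for every $q_\ell \in \{2, 3, \ldots, 2^{n_{2\ell}R'_{2\ell}}\}$, where $\ell \in \mathcal{S}_F$. According to Lemma 7 the probability of each of those errors is upper bounded by $2^{-n_{2\ell}(I(X_2;Y|X_1,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell) - \delta_\epsilon)}$. Note that here the codewords $X_1^{n_1}(1,1)$ and $U^{n_1}(1)$ are fixed and correct for all $\mathcal{S}_F \subseteq \mathcal{S}$, which explains the structure of the mutual information term in (154). Therefore, in order to have $P(E_5(\mathcal{S}_F)) \to 0$ as $n_1 \to \infty$, the following must hold,

\[ \sum_{\ell \in \mathcal{S}_F} P_\ell R'_{2\ell} < \sum_{\ell \in \mathcal{S}_F} P_\ell I(X_2;Y|X_1,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell), \tag{155} \]

for every possible set $\mathcal{S}_F \subseteq \mathcal{S}$.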

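4) In order to upper bound $P(E_6(\mathcal{S}_F))$ for a given set $\mathcal{S}_F$, we have

\begin{align*}
P(E_6(\mathcal{S}_F)) &\stackrel{(a)}{\le} 2^{n_1 R_1' + \sum_{\ell \in \mathcal{S}_F} n_{2\ell} R'_{2\ell}} \cdot 2^{-\sum_{\ell \in \mathcal{S}_F} n_{2\ell}(I(X_1,X_2;Y|U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell) - \delta_\epsilon)} \cdot 2^{-\sum_{t \in \mathcal{S}_T} n_{2t}(I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=t) - \delta_\epsilon)} \\
&= 2^{n_1\left(R_1' + \sum_{\ell \in \mathcal{S}_F} P_\ell (R'_{2\ell} - I(X_1,X_2;Y|U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell) + \delta_\epsilon) - \sum_{t \in \mathcal{S}_T} P_t (I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=t) - \delta_\epsilon)\right)} \tag{156}
\end{align*}

where (a) follows by applying the union bound and Lemma 7. First, note that if the component codeword $X_1^{n_1}(1, j)$, where $j \in \{2, 3, \ldots, 2^{n_1 R_1'}\}$, is incorrect, then all the sub-component codewords $X_1^{n_{2\ell}}$, for every $\ell \in \mathcal{S}$ (and in particular for every $\ell \in \mathcal{S}_F$), are also incorrect. Moreover, errors also occur in the component codewords $X_2^{n_{2\ell}}(1, q_\ell)$ for every $q_\ell \in \{2, 3, \ldots, 2^{n_{2\ell}R'_{2\ell}}\}$ and $\ell \in \mathcal{S}_F$. According to Lemma 7 the probability of each of the errors for $\ell \in \mathcal{S}_F$ is upper bounded by $2^{-n_{2\ell}(I(X_1,X_2;Y|U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell) - \delta_\epsilon)}$ (here both $X_1^{n_{2\ell}}$ and $X_2^{n_{2\ell}}$ are incorrect), whereas the probability of each of the errors for $t \in \mathcal{S}_T$ is upper bounded by $2^{-n_{2t}(I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=t) - \delta_\epsilon)}$ (here only $X_1^{n_{2t}}$ is incorrect). Finally, note that the codeword $U^{n_1}(1)$ is fixed and correct for all $\mathcal{S}_F \subseteq \mathcal{S}$, which explains the structure of the mutual information terms in (156). Therefore, in order to have $P(E_6(\mathcal{S}_F)) \to 0$ as $n_1 \to \infty$, the following must hold,

\[ R_1' + \sum_{\ell \in \mathcal{S}_F} P_\ell R'_{2\ell} < \sum_{\ell \in \mathcal{S}_F} P_\ell I(X_1,X_2;Y|U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell) + \sum_{t \in \mathcal{S}_T} P_t I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=t), \tag{157} \]

for every possible set $\mathcal{S}_F \subseteq \mathcal{S}$.
The inequality in (157) can also be rewritten as,

\begin{align*}
R_1' + \sum_{\ell \in \mathcal{S}_F} P_\ell R'_{2\ell} &< \sum_{\ell \in \mathcal{S}_F} P_\ell \big(I(X_2;Y|U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell) + I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell)\big) \\
&\quad + \sum_{t \in \mathcal{S}_T} P_t I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=t) \\
&= \sum_{\ell \in \mathcal{S}_F} P_\ell I(X_2;Y|U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell) + \sum_{t \in \mathcal{S}} P_t I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=t) \\
&= I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2) + \sum_{\ell \in \mathcal{S}_F} P_\ell I(X_2;Y|U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell). \tag{158}
\end{align*}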
5) In order to upper bound $P(E_8(\mathcal{S}_F))$ for a given set $\mathcal{S}_F$, first consider:

\[ P(E_8(\mathcal{S}_F)) \stackrel{(a)}{\le} 2^{n_1(R_0' + R_1' + R_2' - I(U,X_1,X_2;Y|S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2) + \delta_\epsilon)} \]

where (a) follows from the union bound and from the fact that there is an error in the component codeword $U^{n_1}(i)$, where $i \in \{2, 3, \ldots, 2^{n_1 R_0'}\}$. This is because an incorrect common sub-message $\hat{m}_{0\tilde{s}_1} \neq 1$, for $\tilde{s}_1 \in \mathcal{S}$, causes an error in all other component codewords. Namely, errors occur in the component codeword $X_1^{n_1}(i, j)$, for every $j \in \{2, 3, \ldots, 2^{n_1 R_1'}\}$, and in each of the component codewords $X_2^{n_{2\ell}}(i, q_\ell)$, for every $q_\ell \in \{2, 3, \ldots, 2^{n_{2\ell}R'_{2\ell}}\}$ when $\ell \in \mathcal{S}_F$, and for $q_t = 1$ when $t \in \mathcal{S}_T$. In other words, an error in the common sub-message will cause all three codewords to be incorrect for every $\ell \in \mathcal{S}$. Therefore, the probability of error for the whole block (of length $n_1$) is upper bounded by $2^{-n_1(I(U,X_1,X_2;Y|S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2) - \delta_\epsilon)}$, and in order to have $P(E_8(\mathcal{S}_F)) \to 0$ as $n_1 \to \infty$, the following must hold,

\[ R_0' + R_1' + R_2' < I(U,X_1,X_2;Y|S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2), \tag{159} \]

for every possible set $\mathcal{S}_F \subseteq \mathcal{S}$. We rewrite the inequality in (159) as,

\begin{align*}
R_0' + R_1' + R_2' &\stackrel{(a)}{<} I(X_1,X_2;Y|S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2) + I(U;Y|X_1,X_2,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2) \tag{160} \\
&\stackrel{(b)}{=} I(X_1,X_2;Y|S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2), \tag{161}
\end{align*}

where (a) follows from the mutual information chain rule and (b) follows from the fact that $Y$ is independent of $U$ given $(X_1, X_2, S)$, by the underlying channel model (see Subsection III-A).

Note that the restrictions that arise from upper bounding the probability of the events in (145), (147) and (150) are all redundant given the restriction that is obtained by upper bounding the probability of the event in (151) presented above. This is because in all these events the codeword $U^{n_1}$ is incorrect, which causes the codewords $X_1^{n_1}$ and $X_2^{n_1}$ to be incorrect as well. Thus, the restrictions that must hold in order for these probabilities to go to zero contain the same mutual information term as an upper bound on the corresponding rates. However, upper bounding the probability of (151) yields a restriction on $R_0' + R_1' + R_2'$, whereas upper bounding the probabilities of (145), (147) and (150) yields restrictions on $R_0'$, $R_0' + R_1'$ and $R_0' + R_2'$, respectively. Due to the fact that $R_j' \ge 0$ for all $j \in \{0, 1, 2\}$, the three latter restrictions become redundant. For this reason the upper bounding of the probabilities of (145), (147) and (150) is omitted.

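Summarizing the above results, we get that the probability of error, conditioned on a particular codeword being sent, goes to zero if the following conditions are met:

\begin{align*}
R_1' &< I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2), \tag{162} \\
\sum_{\ell \in \mathcal{S}_F} P_\ell R'_{2\ell} &< \sum_{\ell \in \mathcal{S}_F} P_\ell I(X_2;Y|X_1,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell), \tag{163} \\
R_1' + \sum_{\ell \in \mathcal{S}_F} P_\ell R'_{2\ell} &< I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2) + \sum_{\ell \in \mathcal{S}_F} P_\ell I(X_2;Y|U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell), \tag{164} \\
R_0' + R_1' + R_2' &< I(X_1,X_2;Y|S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2), \tag{165}
\end{align*}

for every possible set $\mathcal{S}_F \subseteq \mathcal{S}$.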



Appendix C 

FOURIER-MOTZKIN ELIMINATION FOR THE PROOF OF THEOREMIH 
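In this appendix we use the FME to show that the set of inequalities:

\begin{align*}
\sum_{\ell \in \mathcal{S}_F} P_\ell R'_{2\ell} &< \sum_{\ell \in \mathcal{S}_F} P_\ell I(X_2;Y|X_1,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell), \tag{166} \\
R_1' + \sum_{\ell \in \mathcal{S}_F} P_\ell R'_{2\ell} &< I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2) + \sum_{\ell \in \mathcal{S}_F} P_\ell I(X_2;Y|U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell), \tag{167}
\end{align*}

for all $\mathcal{S}_F \subseteq \mathcal{S}$, are equivalent to the two inequalities,

\begin{align*}
R_2' &< I(X_2;Y|X_1,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2), \tag{168} \\
R_1' + R_2' &< I(X_1,X_2;Y|U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2). \tag{169}
\end{align*}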
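The equivalence can be checked mechanically on a toy instance. The sketch below runs a generic Fourier-Motzkin elimination in exact rational arithmetic for $k = 2$ states. All numerical values (the probabilities $P_\ell$ and the mutual information terms) are hypothetical, chosen only so that $I_\ell(X_2;Y|X_1) \ge I_\ell(X_2;Y)$. Non-strict inequalities are used, and the system additionally includes non-negativity of the rates, the splitting $R_2' = \sum_\ell P_\ell R'_{2\ell}$, and the inequality of the form (167) for the empty set, which is the single-user bound on $R_1'$ of (162).

```python
from fractions import Fraction as F
from itertools import combinations


def eliminate(rows, j):
    """One Fourier-Motzkin step on rows (coeffs, bound), each meaning coeffs . x <= bound."""
    pos = [r for r in rows if r[0][j] > 0]
    neg = [r for r in rows if r[0][j] < 0]
    out = [r for r in rows if r[0][j] == 0]
    for cp, bp in pos:
        for cn, bn in neg:
            sp, sn = 1 / cp[j], -1 / cn[j]  # positive scalings that cancel variable j
            out.append((tuple(sp * a + sn * b for a, b in zip(cp, cn)), sp * bp + sn * bn))
    return out


# Hypothetical toy values; variables are ordered (R1', R2', R'_21, R'_22).
P = {1: F(1, 2), 2: F(1, 2)}
I2_given_1 = {1: F(2), 2: F(4)}   # I_l(X2;Y|X1)
I2 = {1: F(1), 2: F(2)}           # I_l(X2;Y)
I1_given_2 = F(2)                 # I(X1;Y|X2)


def row(c_r1, c_r2, part, bound):
    """Row c_r1*R1' + c_r2*R2' + sum_l part[l]*R'_2l <= bound."""
    return ((F(c_r1), F(c_r2), F(part.get(1, 0)), F(part.get(2, 0))), F(bound))


rows = [
    row(0, 1, {1: -P[1], 2: -P[2]}, 0),   # R2' <= sum_l P_l R'_2l
    row(0, -1, {1: P[1], 2: P[2]}, 0),    # R2' >= sum_l P_l R'_2l
    row(-1, 0, {}, 0),                    # R1' >= 0
    row(0, 0, {1: -1}, 0),                # R'_21 >= 0
    row(0, 0, {2: -1}, 0),                # R'_22 >= 0
]
for size in range(3):
    for A in combinations((1, 2), size):
        weights = {l: P[l] for l in A}
        if A:  # inequality of the form (166) for the subset A
            rows.append(row(0, 0, weights, sum(P[l] * I2_given_1[l] for l in A)))
        # inequality of the form (167) for the subset A (A empty gives the R1' bound)
        rows.append(row(1, 0, weights, I1_given_2 + sum(P[l] * I2[l] for l in A)))

projected = eliminate(eliminate(rows, 2), 3)  # project onto (R1', R2')


def in_projection(r1, r2):
    return all(c[0] * r1 + c[1] * r2 <= b for c, b in projected)


def claimed_region(r1, r2):
    r2_max = sum(P[l] * I2_given_1[l] for l in P)          # I(X2;Y|X1)
    sum_max = I1_given_2 + sum(P[l] * I2[l] for l in P)    # I(X1,X2;Y)
    return 0 <= r1 <= I1_given_2 and 0 <= r2 <= r2_max and r1 + r2 <= sum_max
```

On this instance the projected system and the pair of bounds on $R_2'$ and $R_1' + R_2'$ (together with the bound on $R_1'$) describe the same set of rate pairs, which can be confirmed by comparing `in_projection` and `claimed_region` on a grid of rational points.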

Recall that the state space is $\mathcal{S} = \{1, 2, \ldots, k\}$. For simplicity, we henceforth denote an arbitrary subset of $\mathcal{S}$ by $A$ rather than $\mathcal{S}_F$ as in (166)-(167). Denote by $\mathcal{S}_A$ the complement of $A$ in $\mathcal{S}$, that is $\mathcal{S}_A = A^c$ (where the complement is taken with respect to the state space $\mathcal{S}$). Specifically, $\mathcal{S}_{\ell_1,\ell_2,\ldots,\ell_p} = \mathcal{S} \setminus \{\ell_1, \ell_2, \ldots, \ell_p\}$ where $\ell_i \in \mathcal{S}$ for all $i \in \{1, 2, \ldots, p\}$ and $p \le k$. Moreover, throughout this appendix we use the notations in (33)-(38) and (141)-(142). Accordingly, we have:

\[ R_2' = \sum_{\ell \in \mathcal{S}} P_\ell R'_{2\ell}. \tag{170} \]

In addition, since all involved mutual information terms are conditional on $(U, S, \tilde{S}_1 = \tilde{s}_1)$, we use the following shortened notation:

\begin{align*}
I(X_1;Y|X_2,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2) &\triangleq \mathbb{I}(X_1;Y|X_2), \\
I(X_2;Y|X_1,U,S,\tilde{S}_1=\tilde{s}_1,\tilde{S}_2=\ell) &\triangleq \mathbb{I}_\ell(X_2;Y|X_1). \tag{171}
\end{align*}

For clarity, we split the FME process into two stages where the second stage serves as an inductive stage that concludes the proof.

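Stage 1: In this stage we eliminate the partial rate $R'_{21}$ (and all its associated components) from the set of inequalities (166)-(167) in order to remain with inequalities for subsets $A \subseteq \mathcal{S}_1$ only.

First, consider the inequality (166) for an arbitrary subset $B \subseteq \mathcal{S}$ that contains the element $\{1\}$. The subset $B$ can be represented by $B = A \cup \{1\}$ for some $A \subseteq \mathcal{S}_1$. Using (170), we rewrite the left hand side (LHS) of this inequality as follows:

\[ \sum_{\ell \in B} P_\ell R'_{2\ell} = R_2' - \sum_{\ell \in B^c} P_\ell R'_{2\ell}. \tag{172} \]

Using similar arguments, for the right hand side (RHS) of (166), we have that (see (171)):

\[ \sum_{\ell \in B} P_\ell \mathbb{I}_\ell(X_2;Y|X_1) = \mathbb{I}(X_2;Y|X_1) - \sum_{\ell \in B^c} P_\ell \mathbb{I}_\ell(X_2;Y|X_1). \tag{173} \]

However, $B^c \subseteq \mathcal{S}_1$, thus $R'_{21}$ is eliminated from all inequalities of the form (166).

A similar procedure can be applied to the inequalities of the form (167), taken for subsets $B \subseteq \mathcal{S}$ containing the element $\{1\}$. The LHS is rewritten using (170), whereas for the RHS consider:

\begin{align*}
\mathbb{I}(X_1;Y|X_2) + \sum_{\ell \in B} P_\ell \mathbb{I}_\ell(X_2;Y) &= \mathbb{I}(X_1;Y|X_2) + \mathbb{I}(X_2;Y) - \sum_{\ell \in B^c} P_\ell \mathbb{I}_\ell(X_2;Y) \\
&= \mathbb{I}(X_1,X_2;Y) - \sum_{\ell \in B^c} P_\ell \mathbb{I}_\ell(X_2;Y). \tag{174}
\end{align*}

By doing so $R'_{21}$ is eliminated from all inequalities of the form (167), and thus we are left with inequalities of four different forms:

\begin{align*}
\sum_{\ell \in A} P_\ell R'_{2\ell} &< \sum_{\ell \in A} P_\ell \mathbb{I}_\ell(X_2;Y|X_1), \tag{175} \\
R_2' - \sum_{\ell \in A} P_\ell R'_{2\ell} &< \mathbb{I}(X_2;Y|X_1) - \sum_{\ell \in A} P_\ell \mathbb{I}_\ell(X_2;Y|X_1), \tag{176} \\
R_1' + \sum_{\ell \in A} P_\ell R'_{2\ell} &< \mathbb{I}(X_1;Y|X_2) + \sum_{\ell \in A} P_\ell \mathbb{I}_\ell(X_2;Y), \tag{177} \\
R_1' + R_2' - \sum_{\ell \in A} P_\ell R'_{2\ell} &< \mathbb{I}(X_1,X_2;Y) - \sum_{\ell \in A} P_\ell \mathbb{I}_\ell(X_2;Y), \tag{178}
\end{align*}

for every $A \subseteq \mathcal{S}_1$.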
Stage 2: Next, we eliminate the partial rate $R'_{22}$ (and all its associated components) from the set of inequalities (175)-(178) in order to remain with inequalities for subsets $A \subseteq \mathcal{S}_{1,2}$ only. In order to do so, we divide the remaining inequalities into two classes. The first contains all the inequalities of the forms (175)-(178) taken for subsets $A \subseteq \mathcal{S}_{1,2}$, that is, subsets $A$ that contain neither of the elements $\{1\}$ and $\{2\}$. The second class consists of all inequalities for which the subset contains the element $\{2\}$. We denote such a subset by $B \subseteq \mathcal{S}_1$ (with some abuse of notation), where $B = A \cup \{2\}$ for some $A \subseteq \mathcal{S}_{1,2}$. Using the FME we eliminate the partial rate $R'_{22}$ and all its associated components from these inequalities. By doing so, new restrictions arise. However, we show that these restrictions are redundant as they are all implied by the inequalities of the first class (namely, the inequalities of the forms (175)-(178) taken for subsets $A \subseteq \mathcal{S}_{1,2}$).

By separating the components which involve the element $\{2\}$ in the second class of inequalities, the latter can be rewritten as:

\begin{align*}
P_2 R'_{22} + \sum_{\ell \in A} P_\ell R'_{2\ell} &< P_2 \mathbb{I}_2(X_2;Y|X_1) + \sum_{\ell \in A} P_\ell \mathbb{I}_\ell(X_2;Y|X_1), \tag{179} \\
R_2' - P_2 R'_{22} - \sum_{\ell \in A} P_\ell R'_{2\ell} &< \mathbb{I}(X_2;Y|X_1) - P_2 \mathbb{I}_2(X_2;Y|X_1) - \sum_{\ell \in A} P_\ell \mathbb{I}_\ell(X_2;Y|X_1), \tag{180} \\
R_1' + P_2 R'_{22} + \sum_{\ell \in A} P_\ell R'_{2\ell} &< \mathbb{I}(X_1;Y|X_2) + P_2 \mathbb{I}_2(X_2;Y) + \sum_{\ell \in A} P_\ell \mathbb{I}_\ell(X_2;Y), \tag{181} \\
R_1' + R_2' - P_2 R'_{22} - \sum_{\ell \in A} P_\ell R'_{2\ell} &< \mathbb{I}(X_1,X_2;Y) - P_2 \mathbb{I}_2(X_2;Y) - \sum_{\ell \in A} P_\ell \mathbb{I}_\ell(X_2;Y), \tag{182}
\end{align*}

for every $A \subseteq \mathcal{S}_{1,2}$ (since $B \setminus \{2\} = A$, for some $A \subseteq \mathcal{S}_{1,2}$).

For the inequalities of the forms (179)-(182), we apply the FME through the following steps:

1) Combining (179)-(180) taken for some subsets $A', A'' \subseteq \mathcal{S}_{1,2}$, respectively, we get:

\[ R_2' - \sum_{\ell \in A''} P_\ell R'_{2\ell} + \sum_{\ell \in A'} P_\ell R'_{2\ell} < \mathbb{I}(X_2;Y|X_1) - \sum_{\ell \in A''} P_\ell \mathbb{I}_\ell(X_2;Y|X_1) + \sum_{\ell \in A'} P_\ell \mathbb{I}_\ell(X_2;Y|X_1). \tag{183} \]

But the above inequality can be constructed by adding (175) taken for the subset $A'$ to (176) taken for the subset $A''$. We thus conclude that the restriction in (183) is redundant.

2) Next, combining (181)-(182) taken for some subsets $A', A'' \subseteq \mathcal{S}_{1,2}$, respectively, we get:

\begin{align*}
R_1' + \sum_{\ell \in A'} P_\ell R'_{2\ell} &+ (R_1' + R_2') - \sum_{\ell \in A''} P_\ell R'_{2\ell} \\
&< \mathbb{I}(X_1,X_2;Y) + \sum_{\ell \in A'} P_\ell \mathbb{I}_\ell(X_2;Y) + \mathbb{I}(X_1;Y|X_2) - \sum_{\ell \in A''} P_\ell \mathbb{I}_\ell(X_2;Y). \tag{184}
\end{align*}

The above inequality can be derived by adding (177) taken for the subset $A'$ to (178) taken for the subset $A''$, and it is thus also redundant.
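3) By combining (179) and (182) taken for some subsets $A', A'' \subseteq \mathcal{S}_{1,2}$, respectively, we get:

\begin{align*}
\sum_{\ell \in A'} P_\ell R'_{2\ell} &+ R_1' + R_2' - \sum_{\ell \in A''} P_\ell R'_{2\ell} \\
&< P_2 \mathbb{I}_2(X_2;Y|X_1) + \sum_{\ell \in A'} P_\ell \mathbb{I}_\ell(X_2;Y|X_1) + \mathbb{I}(X_1,X_2;Y) - P_2 \mathbb{I}_2(X_2;Y) - \sum_{\ell \in A''} P_\ell \mathbb{I}_\ell(X_2;Y) \\
&\stackrel{(a)}{=} P_2\big(\mathbb{I}_2(X_2;Y,X_1) - \mathbb{I}_2(X_2;Y)\big) + \sum_{\ell \in A'} P_\ell \mathbb{I}_\ell(X_2;Y|X_1) + \mathbb{I}(X_1,X_2;Y) - \sum_{\ell \in A''} P_\ell \mathbb{I}_\ell(X_2;Y) \\
&= P_2 \mathbb{I}_2(X_2;X_1|Y) + \sum_{\ell \in A'} P_\ell \mathbb{I}_\ell(X_2;Y|X_1) + \mathbb{I}(X_1,X_2;Y) - \sum_{\ell \in A''} P_\ell \mathbb{I}_\ell(X_2;Y) \tag{185}
\end{align*}

where (a) follows from the fact that $X_1$ is independent of $X_2$ given $(U, \tilde{S}_1, \tilde{S}_2)$. Note that by adding inequality (175) taken for the subset $A'$ to inequality (178) taken for the subset $A''$, we get:

\[ \sum_{\ell \in A'} P_\ell R'_{2\ell} + R_1' + R_2' - \sum_{\ell \in A''} P_\ell R'_{2\ell} < \sum_{\ell \in A'} P_\ell \mathbb{I}_\ell(X_2;Y|X_1) + \mathbb{I}(X_1,X_2;Y) - \sum_{\ell \in A''} P_\ell \mathbb{I}_\ell(X_2;Y), \]

which is tighter than (185) due to the non-negativity of the mutual information. This implies that (185) is redundant as well.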

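4) Finally, by combining (180) and (181) taken for some subsets $\mathcal{A}', \mathcal{A}'' \subseteq \mathcal{S}_{1,2}$, respectively, we get:

$$\begin{aligned}
R_2' &- \sum_{\ell\in\mathcal{A}'} P_\ell R_{2\ell} + R_1' + \sum_{\ell\in\mathcal{A}''} P_\ell R_{2\ell} \\
&< I(X_2;Y|X_1) - P_2 I_2(X_2;Y|X_1) - \sum_{\ell\in\mathcal{A}'} P_\ell I_\ell(X_2;Y|X_1) \\
&\quad + I(X_1;Y|X_2) + P_2 I_2(X_2;Y) + \sum_{\ell\in\mathcal{A}''} P_\ell I_\ell(X_2;Y).
\end{aligned} \tag{186}$$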

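Denoting $\tilde{\mathcal{A}}' = \mathcal{A}' \setminus \{\mathcal{A}' \cap \mathcal{A}''\}$ and $\tilde{\mathcal{A}}'' = \mathcal{A}'' \setminus \{\mathcal{A}' \cap \mathcal{A}''\}$, we rewrite (186) as:

$$\begin{aligned}
R_2' &- \sum_{\ell\in\tilde{\mathcal{A}}'} P_\ell R_{2\ell} + R_1' + \sum_{\ell\in\tilde{\mathcal{A}}''} P_\ell R_{2\ell} \\
&< I(X_2;Y|X_1) - P_2 I_2(X_2;Y|X_1) - \sum_{\ell\in\mathcal{A}'} P_\ell I_\ell(X_2;Y|X_1) + I(X_1;Y|X_2) + P_2 I_2(X_2;Y) + \sum_{\ell\in\mathcal{A}''} P_\ell I_\ell(X_2;Y) \\
&\overset{(a)}{=} I(X_1;Y|X_2) + I(X_2;Y,X_1) + P_2\big(I_2(X_2;Y) - I_2(X_2;Y,X_1)\big) + \sum_{\ell\in\mathcal{A}''} P_\ell I_\ell(X_2;Y) - \sum_{\ell\in\mathcal{A}'} P_\ell I_\ell(X_2;Y|X_1) \\
&= I(X_1;Y|X_2) + I(X_2;Y) + I(X_2;X_1|Y) + P_2\big(I_2(X_2;Y) - I_2(X_2;Y) - I_2(X_2;X_1|Y)\big) \\
&\quad + \sum_{\ell\in\mathcal{A}''} P_\ell I_\ell(X_2;Y) - \sum_{\ell\in\mathcal{A}'} P_\ell I_\ell(X_2;Y|X_1) \\
&= I(X_1,X_2;Y) + \sum_{\ell\in\mathcal{S}} P_\ell I_\ell(X_2;X_1|Y) - P_2 I_2(X_2;X_1|Y) + \sum_{\ell\in\mathcal{A}''} P_\ell I_\ell(X_2;Y) - \sum_{\ell\in\mathcal{A}'} P_\ell I_\ell(X_2;Y|X_1) \\
&= I(X_1,X_2;Y) + \sum_{\ell\in\mathcal{S}\setminus\{2\}} P_\ell I_\ell(X_2;X_1|Y) + \sum_{\ell\in\mathcal{A}''} P_\ell I_\ell(X_2;Y) - \sum_{\ell\in\mathcal{A}'} P_\ell I_\ell(X_2;Y|X_1) \\
&\overset{(b)}{=} I(X_1,X_2;Y) + \sum_{\ell\in\mathcal{S}\setminus\{\mathcal{A}''\cup\{2\}\}} P_\ell I_\ell(X_2;X_1|Y) + \sum_{\ell\in\mathcal{A}''} P_\ell\big(I_\ell(X_2;X_1|Y) + I_\ell(X_2;Y)\big) - \sum_{\ell\in\mathcal{A}'} P_\ell I_\ell(X_2;Y|X_1) \\
&\overset{(c)}{=} I(X_1,X_2;Y) + \sum_{\ell\in\mathcal{S}\setminus\{\mathcal{A}''\cup\{2\}\}} P_\ell I_\ell(X_2;X_1|Y) + \sum_{\ell\in\mathcal{A}''} P_\ell I_\ell(X_2;Y,X_1) - \sum_{\ell\in\mathcal{A}'} P_\ell I_\ell(X_2;Y,X_1) \\
&\overset{(d)}{=} I(X_1,X_2;Y) + \sum_{\ell\in\mathcal{S}\setminus\{\mathcal{A}''\cup\{2\}\}} P_\ell I_\ell(X_2;X_1|Y) + \sum_{\ell\in\tilde{\mathcal{A}}''} P_\ell I_\ell(X_2;Y,X_1) - \sum_{\ell\in\tilde{\mathcal{A}}'} P_\ell I_\ell(X_2;Y,X_1) \\
&\overset{(e)}{=} I(X_1,X_2;Y) + \sum_{\ell\in\mathcal{S}\setminus\{\mathcal{A}''\cup\{2\}\}} P_\ell I_\ell(X_2;X_1|Y) + \sum_{\ell\in\tilde{\mathcal{A}}''} P_\ell I_\ell(X_2;Y|X_1) - \sum_{\ell\in\tilde{\mathcal{A}}'} P_\ell\big(I_\ell(X_2;Y) + I_\ell(X_2;X_1|Y)\big) \\
&= I(X_1,X_2;Y) + \sum_{\ell\in\mathcal{S}\setminus\{\mathcal{A}''\cup\{2\}\}} P_\ell I_\ell(X_2;X_1|Y) - \sum_{\ell\in\tilde{\mathcal{A}}'} P_\ell I_\ell(X_2;X_1|Y) + \sum_{\ell\in\tilde{\mathcal{A}}''} P_\ell I_\ell(X_2;Y|X_1) - \sum_{\ell\in\tilde{\mathcal{A}}'} P_\ell I_\ell(X_2;Y) \\
&\overset{(f)}{=} I(X_1,X_2;Y) + \sum_{\ell\in\mathcal{S}\setminus\{\mathcal{A}'\cup\mathcal{A}''\cup\{2\}\}} P_\ell I_\ell(X_2;X_1|Y) + \sum_{\ell\in\tilde{\mathcal{A}}''} P_\ell I_\ell(X_2;Y|X_1) - \sum_{\ell\in\tilde{\mathcal{A}}'} P_\ell I_\ell(X_2;Y),
\end{aligned}$$

where:

(a), (c) and (e) follow from the fact that $X_1$ is independent of $X_2$ given $(U, \tilde{S}_1, \tilde{S}_2)$;

(b) follows from the fact that $\mathcal{A}'' \subseteq \mathcal{S}\setminus\{2\}$;

(d) follows by eliminating the common terms from the last two sums;

(f) follows from the fact that $\tilde{\mathcal{A}}' \subseteq \mathcal{S}\setminus\{\mathcal{A}''\cup\{2\}\}$, and by eliminating the common terms from the two relevant sums.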



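To conclude, applying the FME on (180) and (181) yields the following inequality:

$$\begin{aligned}
\sum_{\ell\in\tilde{\mathcal{A}}''} & P_\ell R_{2\ell} + R_1' + R_2' - \sum_{\ell\in\tilde{\mathcal{A}}'} P_\ell R_{2\ell} \\
&< I(X_1,X_2;Y) + \sum_{\ell\in\tilde{\mathcal{A}}''} P_\ell I_\ell(X_2;Y|X_1) - \sum_{\ell\in\tilde{\mathcal{A}}'} P_\ell I_\ell(X_2;Y) + \sum_{\ell\in\mathcal{S}\setminus\{\mathcal{A}'\cup\mathcal{A}''\cup\{2\}\}} P_\ell I_\ell(X_2;X_1|Y).
\end{aligned} \tag{187}$$

Finally, note that by adding (175) taken for the subset $\tilde{\mathcal{A}}''$ to (178) taken for the subset $\tilde{\mathcal{A}}'$, we get:

$$\sum_{\ell\in\tilde{\mathcal{A}}''} P_\ell R_{2\ell} + R_1' + R_2' - \sum_{\ell\in\tilde{\mathcal{A}}'} P_\ell R_{2\ell} < \sum_{\ell\in\tilde{\mathcal{A}}''} P_\ell I_\ell(X_2;Y|X_1) + I(X_1,X_2;Y) - \sum_{\ell\in\tilde{\mathcal{A}}'} P_\ell I_\ell(X_2;Y),$$

which is tighter than (187) in view of the non-negativity of the mutual information. (187) is therefore also redundant.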
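The redundancy arguments in steps 3) and 4) rest on three mutual-information facts: the chain rule $I(X_2;Y,X_1) = I(X_2;Y) + I(X_2;X_1|Y)$, the equality $I(X_2;Y|X_1) = I(X_2;Y,X_1)$ when $X_1$ and $X_2$ are independent, and non-negativity. The following Python sketch checks them numerically on a toy channel. The channel (a noisy XOR of two independent binary inputs), the parameter values and the helper name `mutual_information` are illustrative assumptions and are not taken from the paper; the conditioning on $(U, \tilde{S}_1, \tilde{S}_2)$ is dropped for brevity.

```python
import itertools
import math

def mutual_information(pmf, a_idx, b_idx, c_idx=()):
    """Return I(A;B|C) in bits for a joint pmf {outcome_tuple: prob}.

    a_idx, b_idx, c_idx are tuples of coordinate positions in the outcome.
    """
    def marginal(idx):
        out = {}
        for outcome, p in pmf.items():
            key = tuple(outcome[i] for i in idx)
            out[key] = out.get(key, 0.0) + p
        return out

    p_abc = marginal(a_idx + b_idx + c_idx)
    p_ac = marginal(a_idx + c_idx)
    p_bc = marginal(b_idx + c_idx)
    p_c = marginal(c_idx)
    total = 0.0
    for outcome, p in pmf.items():
        if p == 0.0:
            continue
        a = tuple(outcome[i] for i in a_idx)
        b = tuple(outcome[i] for i in b_idx)
        c = tuple(outcome[i] for i in c_idx)
        ratio = p_abc[a + b + c] * p_c[c] / (p_ac[a + c] * p_bc[b + c])
        total += p * math.log2(ratio)
    return total

# Hypothetical toy model: independent binary inputs, Y = X1 xor X2 flipped w.p. 0.1.
p1, p2, flip = 0.3, 0.6, 0.1
pmf = {}
for x1, x2, y in itertools.product((0, 1), repeat=3):
    px1 = p1 if x1 else 1 - p1
    px2 = p2 if x2 else 1 - p2
    py = 1 - flip if y == (x1 ^ x2) else flip
    pmf[(x1, x2, y)] = px1 * px2 * py

X1, X2, Y = (0,), (1,), (2,)
i_cond = mutual_information(pmf, X2, Y, X1)        # I(X2;Y|X1)
i_joint = mutual_information(pmf, X2, Y + X1)      # I(X2;Y,X1)
i_y = mutual_information(pmf, X2, Y)               # I(X2;Y)
i_x1_given_y = mutual_information(pmf, X2, X1, Y)  # I(X2;X1|Y)

print(i_cond, i_joint, i_y + i_x1_given_y)
```

In this toy model the three printed values agree up to floating-point error, and the gap $I(X_2;Y|X_1) - I(X_2;Y) = I(X_2;X_1|Y)$ is non-negative, which is exactly the slack that makes (185) and (187) looser than the sums of (175) and (178).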



We hence conclude that we are left with inequalities of the forms (175)-(178) for every $\mathcal{A} \subseteq \mathcal{S}_{1,2}$. This in turn implies that all the components which involve the element $\{2\}$ are eliminated. By repeating Stage 2 in order to eliminate $R_{23}, R_{24}, \ldots, R_{2k}$ (and their associated components), it can be shown that the only non-redundant inequalities are those taken for the subsets $\mathcal{A} \subseteq \mathcal{S}_{1,2,\ldots,k}$. However, since $\mathcal{S}_{1,2,\ldots,k} = \emptyset$, its only subset is $\mathcal{A} = \emptyset$, for which the inequalities in (175)-(178) reduce to:

$$R_2' < I(X_2;Y|X_1,U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2), \tag{188}$$

$$R_1' + R_2' < I(X_1,X_2;Y|U,S,\tilde{S}_1 = \tilde{s}_1,\tilde{S}_2). \tag{189}$$
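The elimination procedure used throughout this appendix can be illustrated with a minimal Python sketch of one Fourier-Motzkin elimination step. It is not the paper's derivation: the function name `fme_eliminate`, the two-variable toy system and its numeric bounds are hypothetical, and the sketch uses non-strict inequalities `coeffs . x <= bound` where the text uses strict ones. It also does not automate the redundancy checks, which the text carries out by hand.

```python
from fractions import Fraction

def fme_eliminate(rows, j):
    """Eliminate variable j from a system of rows (coeffs, bound),
    each meaning sum_i coeffs[i] * x[i] <= bound."""
    pos, neg, rest = [], [], []
    for coeffs, bound in rows:
        c = coeffs[j]
        if c > 0:
            pos.append((coeffs, bound))
        elif c < 0:
            neg.append((coeffs, bound))
        else:
            rest.append((coeffs, bound))
    out = list(rest)
    for cp, bp in pos:
        for cn, bn in neg:
            # Scale both rows so x_j has coefficients +1 and -1, then add them.
            sp = Fraction(1) / cp[j]
            sn = Fraction(1) / (-cn[j])
            coeffs = tuple(sp * a + sn * b for a, b in zip(cp, cn))
            out.append((coeffs, sp * bp + sn * bn))
    return out

# Toy system in the variables (R2, r), where r plays the role of a split rate:
#   r <= 2          (an upper bound on the split rate, in the spirit of (179))
#   R2 - r <= 3     (the remaining rate, in the spirit of (180))
#   -r <= 0         (non-negativity of the split rate)
rows = [((0, 1), 2), ((1, -1), 3), ((0, -1), 0)]
result = fme_eliminate(rows, 1)
for coeffs, bound in result:
    print(coeffs, "<=", bound)
```

Eliminating `r` pairs the row with a positive coefficient against each row with a negative one. Here the pairing yields `R2 <= 5` together with the trivial row `0 <= 2`, mirroring how the combined inequalities in steps 1) to 4) are either kept or discarded as redundant.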

Appendix D 

PROOF OF THE MARKOV RELATION IN (60)-(62)

We prove the Markov relation (60)-(62) using the following claims.
The equality in ^ follows from the facts that (Vf , V^, S'^-'^^-^) - Sg^di - Sg-d2 - Sq and (if, if, S^-'^^'^) - 
{Sq-di , Q) - SQ-d2 - Sq. 

To show (61), consider the following relations:

= Y P("^l'2;i,g|Sg,Sg-di,Sg-d2,'^l,W2>s''~'''~\'?) 
mi 

= Y Pi'^i\^g^^'i-di,Sg^d2,vi,V2,mi,s'^~'^^^^,q) 

miEMi 

■ P{xi,q\Sq, Sq-d,,Sq-d2,vi,V2, TTT-l , s''"''^ ~\ (?) 
= Y ^^2: 'mi,s'^''^^~^,q)p{xi,q\Sq-d^ , , , , S«"''l " \ (?) 

= Y Pi'^i'^i,'i\^q-di,vi,V2,s'^^'^^^^,q) 



pixi,q\Sq-di,v{,V^2,s'' \ q) 
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where (a) follows from the fact that AIi is independent of 5" given {Vf, V2) and the fact that Xi^q is a deterministic 
function of (F/, F/, , S'^-'^^^^). Now, since this is true for all q, and because the auxiliary RV is defined 

as [/ = (y/, t/2^ Q), it holds that 

$$P(x_1|s,\tilde{s}_1,\tilde{s}_2,u) = P(x_1|\tilde{s}_1,u).$$

Finally, to show (62), we use the following relations:

P{x2,q\xi^q, Sg, Sg^di , Sq-d^ , Uq , Q) = P{x2^q\xi^q, Sq, Sq-di , Sq-d2 , ^i, V^, S*^^*^^ , q) 
= X! Pi'^2,X2,q\xi,q,Sq,Sq-di,Sq-d2,vi,V2,s'^^'^^^^,q) 

■ P{x2,q\xi^q, Sq, Sq-di , Sq-d2 , ^l, V2, 1712, s'^''^^"^ , q) 

- X! P('^2|Sg-di, Sg-d2, Wl, ^2, s'i^'^^^^ , q)p{x2,q\s^-dl , Sq-d2,v{,vi, 7712, S«"''i"\ q) 
mi6Xi 

= X! P("^2,2;2,9|Sg-(ii,Sg-d2,«l,W2>s'~'''~\<7) 
= P(a;2.g|Sg-di,Sg_d2,'"9) 

where (a) follows from the fact that M2 is independent of {Xi,S") given [V^ ,¥2 ) and the fact that X2.g is 
independent of {Xi^q,Sq) given {V( ,Vj , M2, Sq-dn Sq-d2^ S'^^'^^^^). Again, since the above holds for every q 
and by the definition of U, we can conclude that 

$$P(x_2|x_1,s,\tilde{s}_1,\tilde{s}_2,u) = P(x_2|\tilde{s}_1,\tilde{s}_2,u).$$

Appendix E
PROOF OF LEMMAg] 



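In order to prove the RHS of (140), consider:

$$\begin{aligned}
P\{(x^n,y^n)\in T_\epsilon^{(n)}\} &\overset{(a)}{\le} P\{(x^{n_1},y^{n_1})\in T_\epsilon^{(n_1)},\ (x^{n_2},y^{n_2})\in T_\epsilon^{(n_2)}\} \\
&\overset{(b)}{=} P\{(x^{n_1},y^{n_1})\in T_\epsilon^{(n_1)}\} \sum_{(x^{n_2},y^{n_2})\in T_\epsilon^{(n_2)}} P_X(x^{n_2})P_Y(y^{n_2}) \\
&\overset{(c)}{\le} \sum_{(x^{n_2},y^{n_2})\in T_\epsilon^{(n_2)}} P_X(x^{n_2})P_Y(y^{n_2}) \\
&\overset{(d)}{\le} 2^{n_2 H(X,Y)(1+\epsilon)} \cdot 2^{-n_2 H(X)(1-\epsilon)} \cdot 2^{-n_2 H(Y)(1-\epsilon)} \\
&= 2^{n_2\left(H(X,Y)-H(X)-H(Y)+\epsilon(H(X,Y)+H(X)+H(Y))\right)} \\
&\overset{(e)}{=} 2^{-n_2(I(X;Y)-\delta_\epsilon)}
\end{aligned} \tag{190}$$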

where: 

(a) follows from the fact that if $(x^n,y^n)\in T_\epsilon^{(n)}$ then $(x^{n_1},y^{n_1})\in T_\epsilon^{(n_1)}$ and $(x^{n_2},y^{n_2})\in T_\epsilon^{(n_2)}$;

(b) follows from the fact that the sequences are drawn independently and the fact that the $(x^{n_2},y^{n_2})$ sequences are drawn according to $P_X(x)P_Y(y)$;

(c) and (d) follow from the fact that probability is always upper bounded by 1 and from the properties of typical sets;

(e) follows by denoting $\delta_\epsilon = \epsilon\,(H(X,Y) + H(X) + H(Y))$.
We thus conclude that 

$$P\{(x^n,y^n)\in T_\epsilon^{(n)}\} \le 2^{-n_2(I(X;Y)-\delta_\epsilon)} \tag{191}$$

where $\delta_\epsilon \to 0$ as $\epsilon \to 0$.

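To prove the LHS of (140), we start from (b) in (190) and write:

$$\begin{aligned}
P\{(x^{n_1},y^{n_1})\in T_\epsilon^{(n_1)}\} & \sum_{(x^{n_2},y^{n_2})\in T_\epsilon^{(n_2)}} P_X(x^{n_2})P_Y(y^{n_2}) \\
&\overset{(a),(b)}{\ge} (1-\delta) \sum_{(x^{n_2},y^{n_2})\in T_\epsilon^{(n_2)}} 2^{-n_2 H(X)(1+\epsilon)} \cdot 2^{-n_2 H(Y)(1+\epsilon)} \\
&\overset{(c)}{\ge} (1-\delta)\cdot(1-\delta_{\epsilon,n}) \cdot 2^{n_2 H(X,Y)(1-\epsilon)} \cdot 2^{-n_2 H(X)(1+\epsilon)} \cdot 2^{-n_2 H(Y)(1+\epsilon)} \\
&\overset{(d)}{=} (1-\delta'_{\epsilon,n}) \cdot 2^{n_2\left(H(X,Y)-H(X)-H(Y)-\epsilon(H(X,Y)+H(X)+H(Y))\right)} \\
&\overset{(e)}{=} (1-\delta'_{\epsilon,n}) \cdot 2^{-n_2(I(X;Y)+\delta_\epsilon)}
\end{aligned}$$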

where: 

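(a) follows from the fact that $(x^{n_1},y^{n_1})$ are drawn i.i.d. according to $P_{XY}(x,y)$ and thus $\Pr\{(x^{n_1},y^{n_1})\in T_\epsilon^{(n_1)}\} \ge 1-\delta$;

(b) and (c) follow from the properties of typical sets;

(d) follows from denoting $\delta'_{\epsilon,n} = \delta + \delta_{\epsilon,n} - \delta_{\epsilon,n}\delta$, so that $(1-\delta)(1-\delta_{\epsilon,n}) = 1-\delta'_{\epsilon,n}$;

(e) follows by denoting $\delta_\epsilon = \epsilon\,(H(X,Y) + H(X) + H(Y))$.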
We conclude that 

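$$P\{(x^n,y^n)\in T_\epsilon^{(n)}\} \ge (1-\delta'_{\epsilon,n}) \cdot 2^{-n_2(I(X;Y)+\delta_\epsilon)} \tag{192}$$

where $\delta_\epsilon \to 0$ and $\delta'_{\epsilon,n} \to 0$ as $\epsilon \to 0$. The proof is completed by combining (191) and (192).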
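The exponent algebra behind the two bounds can be checked numerically. The Python sketch below computes $H(X)$, $H(Y)$, $H(X,Y)$ for an example joint pmf and verifies that the exponents built from the typical-set estimates collapse to $I(X;Y)$ up to a slack of $\delta_\epsilon = \epsilon\,(H(X,Y)+H(X)+H(Y))$. The joint pmf, the values of $\epsilon$ and $n_2$, and the function names are illustrative assumptions, not quantities from the paper.

```python
import math

def entropies(pxy):
    """Return (H(X), H(Y), H(X,Y)) in bits for a joint pmf {(x, y): prob}."""
    px, py = {}, {}
    for (x, y), p in pxy.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p

    def h(dist):
        return -sum(p * math.log2(p) for p in dist.values() if p > 0)

    return h(px), h(py), h(pxy)

def bound_exponents(pxy, eps, n2):
    """Return (I(X;Y), delta_eps, upper exponent, lower exponent).

    The upper exponent uses |T| <= 2^{n2 H(X,Y)(1+eps)} and pmf values
    <= 2^{-n2 H(.)(1-eps)}; the lower exponent uses the reversed estimates.
    """
    hx, hy, hxy = entropies(pxy)
    mi = hx + hy - hxy
    delta = eps * (hxy + hx + hy)
    upper = n2 * (hxy * (1 + eps) - hx * (1 - eps) - hy * (1 - eps))
    lower = n2 * (hxy * (1 - eps) - hx * (1 + eps) - hy * (1 + eps))
    return mi, delta, upper, lower

# Hypothetical example: a binary symmetric dependence between X and Y.
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
eps, n2 = 0.01, 100
mi, delta, upper, lower = bound_exponents(pxy, eps, n2)
print(mi, delta, upper, lower)
```

For any pmf the sketch gives `upper == -n2 * (mi - delta)` and `lower == -n2 * (mi + delta)` up to rounding, so both exponents approach $-n_2 I(X;Y)$ as $\epsilon \to 0$.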
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